Layout conversion pass


#1

Right now we use the AlterOpLayout pass, which automatically decides which layout to use based on the target hardware backend.

Given that we also want to offer the flexibility to programmatically build pass pipelines, and that there is an increasing need to convert between layouts (e.g. NHWC to NCHW), we might also want to introduce a layout conversion pass that a user can invoke explicitly. This would provide additional, optional flexibility that some of our current frontends need. Example usage:

mod_nhwc = relay.from_tflite(model)
mod_nchw = relay.transform.ConvertLayout("NCHW", "NHWC")(mod_nhwc)

Let us use this thread to discuss the API choices and possible implementation problems.

cc @yzhliu @FrozenGene @merrymercy @zhiics


Converting a whole model from NHWC to NWHW in Relay
#2

That would be very convenient for a lot of reasons (e.g. for now x86 and ARM lack NHWC layout support).

One question regarding the layout: what are the pros and cons of explicitly specifying the layout with a Relay transform pass vs. inferring it from the layout of the input shape? (When using most Relay importers we need to pass a shape dict anyway; would there be a benefit to enforcing the layout from the shape?)


#3

This will be a very useful feature. I am not sure if it is clean, but this could also tie in very well with the target-specific AlterOpLayout pass, where we would then not have to handle all the different types of layout transforms.

I like the overall API that is mentioned. I think the implementation might be a little more complicated and might require many individual passes (very similar to the AlterOpLayout infra). For example,

conv(layout="NHWC")
sum(axis=3)
conv(layout="NHWC")
sum(axis=3)

needs a mechanism to mutate sum attributes so that we can have

transpose(NHWC->NCHW)
conv(layout="NCHW")
sum(axis=1)
conv(layout="NCHW")
sum(axis=1)
transpose(NCHW->NHWC)
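
As a rough illustration of the attribute mutation described above, here is a minimal pure-Python sketch of how an axis could be remapped between two layout strings. The helper names remap_axis and transpose_perm are made up for this example and are not part of Relay.

```python
def remap_axis(axis, src_layout, dst_layout):
    """Return the position in dst_layout of the dimension found at `axis` in src_layout."""
    if sorted(src_layout) != sorted(dst_layout):
        raise ValueError("layouts must name the same dimensions")
    # Normalize negative axes (e.g. -1 means the last dimension).
    axis = axis % len(src_layout)
    return dst_layout.index(src_layout[axis])


def transpose_perm(src_layout, dst_layout):
    """Axes permutation that turns data stored in src_layout into dst_layout."""
    return [src_layout.index(dim) for dim in dst_layout]


# sum(axis=3) on NHWC data reduces over C, which is axis 1 in NCHW.
print(remap_axis(3, "NHWC", "NCHW"))
# Permutation a transpose(NHWC->NCHW) would use.
print(transpose_perm("NHWC", "NCHW"))
```

A real pass would need one such rule per operator whose attributes refer to axes (sum, squeeze, concatenate and so on), which is why the infrastructure ends up looking similar to AlterOpLayout.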

#4

Great discussion so far. It would be great if someone could volunteer to lead the charge.


#5

For the layout transformation, I think one thing we should care about is how we do the work. For example, the current TF frontend does it like this:
insert transpose(NHWC->NCHW) -> conv(NCHW) -> insert transpose(NCHW->NHWC)
I think we don’t want to do it this way because of the transpose cost.

Next, I want to make some points about the frontend and the backend.

  • Frontend

So if we want one Relay pass to do it, we should be careful about how long the transpose path is allowed to be. The ideal path is:

mod_nchw = relay.transform.ConvertLayout("NCHW", "NHWC")(mod_nhwc)

->

input (NHWC) -> insert transpose(NHWC->NCHW) -> model ops -> insert transpose (NCHW->NHWC)->output
That is to say, we insert just two transposes, at the beginning and at the end, so that users can use the model as NHWC without any difference. We could discuss whether the output should be converted back to NHWC.
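
To make the difference between the two approaches concrete, here is a small pure-Python sketch. It models a graph as a flat list of tuples, wraps every conv in transposes (the TF-frontend style described above), and then cancels adjacent transposes that undo each other. The tuple encoding and the function names are assumptions made for this example only, not how Relay represents or rewrites a graph.

```python
def wrap_each_conv(ops, src="NHWC", dst="NCHW"):
    """Naive conversion: surround every conv with a pair of transposes."""
    out = []
    for op in ops:
        if op == "conv":
            out += [("transpose", src, dst), ("conv", dst), ("transpose", dst, src)]
        else:
            # Any other op is left untouched in the source layout.
            out.append((op, src))
    return out


def cancel_inverse_transposes(ops):
    """Drop adjacent transpose pairs where the second one undoes the first."""
    out = []
    for op in ops:
        prev = out[-1] if out else None
        if (prev is not None and prev[0] == "transpose" and op[0] == "transpose"
                and prev[1] == op[2] and prev[2] == op[1]):
            out.pop()  # transpose(a->b) followed by transpose(b->a) is a no-op
        else:
            out.append(op)
    return out


naive = wrap_each_conv(["conv", "conv"])
ideal = cancel_inverse_transposes(naive)
print(sum(op[0] == "transpose" for op in naive))  # transposes before cancelling
print(ideal)
```

In this toy model an op between two convs (say a sum) would block the cancellation, because the transposes around it are no longer adjacent. Getting to the two-transpose form in general needs the pass to rewrite such ops into the new layout as well.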

However, we must also be careful with some ops, for example squeeze, reshape and so on, whose axis information in the original model assumes NHWC.

  • Backend

We have different backends, such as x86, ARM, NV GPU, Intel GPU, and ARM Mali. NCHW is not the best layout for all backends. For example, we observe that NHWC may be a better layout for ARM CPU. When we load a TFLite model, how do we set the data layout? By providing a parameter like the TF frontend (from_tensorflow) does, or something else?


#6

Exactly, the layout transformation pass needs to do some heavy lifting to avoid excessive transformations.

For each frontend, I think we could start with a default layout used by the frontend, and allow the user to programmatically call the ConvertLayout pass to change the layout to the desired one.


#7

I’m working on the layout inference pass (though progress has been interrupted from time to time).
Regarding @thierry's point about enforcing the layout from the shape, I would also like to make it depend on the InferShape pass, as in some cases shapes are required, e.g. for broadcast.
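
One way to see why shapes matter for broadcast: a bias of shape (C,) broadcasts against NHWC data as-is, but against NCHW data it has to become (C, 1, 1), and working that out requires knowing the operand's rank. A minimal pure-Python sketch, assuming right-aligned (NumPy-style) broadcasting; the function name is hypothetical and not an existing pass or API.

```python
def convert_broadcast_operand_shape(shape, src_layout, dst_layout):
    """Shape a broadcast operand needs after the other operand moves to dst_layout."""
    rank = len(src_layout)
    if len(shape) > rank:
        raise ValueError("operand has more dimensions than the layout")
    # Right-align the operand against the full layout by padding with 1s.
    padded = (1,) * (rank - len(shape)) + tuple(shape)
    # Reorder the padded dimensions into the destination layout.
    permuted = tuple(padded[src_layout.index(dim)] for dim in dst_layout)
    # Leading 1s are implied by broadcasting, so they can be dropped again.
    start = 0
    while start < len(permuted) - 1 and permuted[start] == 1:
        start += 1
    return permuted[start:]


# A per-channel bias for NHWC data, moved to NCHW.
print(convert_broadcast_operand_shape((64,), "NHWC", "NCHW"))
```

Without the inferred shape of the operand there is no way to tell which layout dimensions it lines up with, which is the dependency on InferShape mentioned above.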