[TF] Kernel Layout: HWIO vs. HWOI


#1

In our NNVM/Relay model importers, we appear to be applying two different standards for kernel layout:

It appears that we force the HWOI format when we encounter a depthwise_conv2d; however, the TF documentation makes it clear that the format should be [filter_height, filter_width, in_channels, channel_multiplier], as documented at https://www.tensorflow.org/api_docs/python/tf/nn/depthwise_conv2d

I propose to fix this issue and enforce HWIO across the board by default when we import a TF model. This problem was hinted at in an earlier discuss forum post: Issue in depthwise convolution
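
To make the layout question concrete, here is a minimal numpy sketch (shapes chosen arbitrarily, not taken from any model) of how the kernel shape documented by TF relates to an HWIO vs. HWOI reading:

```python
import numpy as np

# TF depthwise_conv2d kernel as documented:
# [filter_height, filter_width, in_channels, channel_multiplier]
kh, kw, c_in, multiplier = 3, 3, 32, 1
tf_kernel = np.zeros((kh, kw, c_in, multiplier), dtype="float32")

# Read as HWIO: I = in_channels, O = channel_multiplier.
# An importer that expects HWOI instead needs the last two axes swapped:
kernel_hwoi = tf_kernel.transpose(0, 1, 3, 2)

print(tf_kernel.shape)    # (3, 3, 32, 1) -> HWIO, as documented by TF
print(kernel_hwoi.shape)  # (3, 3, 1, 32) -> HWOI
```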


#2

@srkreddy1238 @tqchen @zhiics @yzhliu @grwlf @siju-samuel @apivovarov


#3

This should be resolved by https://github.com/dmlc/tvm/pull/3676/files


#4

And I think it should not be a problem; it is an issue in our implementation. Before this PR we did not correctly distinguish between depthwise conv and grouped conv. In fact, TFLite also has two different kernel layouts for conv and depthwise conv. The special thing about the depthwise conv kernel is that its shape is KH KW C M, so strictly speaking the correct kernel layout should be HWOM.
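
To illustrate (just a sketch with arbitrary shapes, assuming a multiplier of 1 and that Relay accepts the HWOI kernel layout for NHWC data here), depthwise conv is essentially grouped conv with groups equal to the number of input channels, and HWOI then places the input channels in the O position:

```python
from tvm import relay

# Sketch: depthwise conv expressed as a grouped conv (groups == in_channels).
# Shapes are arbitrary; kernel_layout follows the HWOI convention discussed
# above, where O holds the input channels and I holds the multiplier.
c_in, multiplier = 32, 1
data = relay.var("data", shape=(1, 224, 224, c_in), dtype="float32")
weight = relay.var("weight", shape=(3, 3, c_in, multiplier), dtype="float32")

out = relay.nn.conv2d(
    data,
    weight,
    kernel_size=(3, 3),
    channels=c_in * multiplier,
    groups=c_in,              # groups == in_channels => depthwise
    data_layout="NHWC",
    kernel_layout="HWOI",
)
print(relay.Function([data, weight], out))
```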


#5

Thanks for the clarification @FrozenGene, but correct me if I’m wrong: shouldn’t HWOI, in the case of depthwise conv, be interpreted as HWMC, which would contradict the TF layout requirements?

I found these lines: https://github.com/dmlc/tvm/blob/master/python/tvm/relay/frontend/tflite.py#L672-L674 to be in contradiction with these lines: https://github.com/dmlc/tvm/blob/master/python/tvm/relay/frontend/tflite.py#L656-L659


#6

@thierry TFLite’s depthwise conv kernel layout is not the same as TF’s; they are different. TFLite’s normal conv kernel layout is also different from TF’s.

For TFLite, we follow the definition here: https://github.com/dmlc/tvm/blob/master/topi/python/topi/nn/depthwise_conv2d.py#L131. As you say, for NCHW (https://github.com/dmlc/tvm/blob/master/topi/python/topi/nn/depthwise_conv2d.py#L60) we always use OIHW. However, for depthwise convolution the O value equals the input tensor’s channel count, i.e. ic.

P.S.
I must apologize: my TFLite implementation of depthwise convolution is not correct when the depth multiplier is greater than 1. TFLite’s depthwise convolution kernel layout is in fact not what I commented there (the TFLite depthwise kernel layout is 1 x KH x KW x (input_channel * multiplier)). I became aware of this issue while working on https://github.com/dmlc/tvm/pull/3676. I will add support for a TFLite depthwise convolution multiplier greater than 1 in the next several days. But again, TFLite’s conv / depthwise convolution kernel layouts are not the same as TF’s.
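
For what it’s worth, here is a small numpy sketch (hypothetical shapes) of how the flat TFLite depthwise kernel could be viewed as the (KH, KW, in_channel, multiplier) filter described by topi’s depthwise_conv2d_nhwc, assuming the multiplier varies fastest along the last axis:

```python
import numpy as np

# TFLite stores the depthwise kernel as 1 x KH x KW x (input_channel * multiplier).
kh, kw, c_in, multiplier = 3, 3, 32, 2
tflite_kernel = np.zeros((1, kh, kw, c_in * multiplier), dtype="float32")

# Recover a per-channel view, (KH, KW, in_channel, multiplier), assuming the
# multiplier is the innermost (fastest-varying) part of the last axis.
kernel_hwcm = tflite_kernel.reshape(kh, kw, c_in, multiplier)
print(kernel_hwcm.shape)  # (3, 3, 32, 2)
```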


#7

Thank you for the explanation. I must apologize for not making it clear that I was initially referring to TFLite rather than TF; I did not realize that the layout requirements differ between the two frameworks.

And fixing the depthwise convolution would be great; I’ll keep an eye out for the PR!


#8

Also, perhaps as a way to prevent further confusion, what do you think of the following naming convention for depthwise convolution: HWIM and MIHW?


#9

Yes, I think so. However, the question is whether it is worth doing, or whether we could reuse the current approach and add a comment explaining what is going on.