Why do we only support kernel (1,1) for TF conv2d_transpose with SAME padding?

I noticed that we only support kernel (1,1) for TF conv2d_transpose with SAME padding (see test_forward.py L381, test_forward.py L385, and PR 4300).

I think we can support conv2d_transpose with SAME padding for any kernel size. We just need to run conv2d_transpose with VALID padding (0, 0) and then de-pad the output.

I prepared SAME and VALID outputs for kernel (3,3) below. As you can see, the middle part of the VALID output is identical to the SAME output. It looks like, in order to support SAME padding with a non-(1,1) kernel, we just need to de-pad (slice) the VALID conv2d_transpose output; a quick verification sketch follows the outputs.

conv2d_transpose SAME output (1,3,3,2)

[[[[1.2143719 1.3307922]
   [2.3303804 2.4103665]
   [2.0152621 1.9104322]]

  [[2.0949996 2.192451 ]
   [3.8877935 3.8518226]
   [3.222784  2.938037 ]]

  [[1.5925353 1.5517256]
   [2.834947  2.6255574]
   [2.238522  1.9052962]]]]

conv2d_transpose VALID output (1,5,5,2)

[[[[0.17881976 0.22647332]
   [0.5173658  0.60219085]
   [1.036694   1.134966  ]
   [0.9461776  0.9461233 ]
   [0.60248697 0.5678561 ]]

  [[0.44633663 0.52967083]
   [1.2143719  1.3307922 ]
   [2.3303804  2.4103665 ]
   [2.0152621  1.9104322 ]
   [1.2389357  1.1096866 ]]

  [[0.8069978  0.9107976 ]
   [2.0949996  2.192451  ]
   [3.8877935  3.8518226 ]
   [3.222784   2.938037  ]
   [1.9267566  1.6641226 ]]

  [[0.65010977 0.67964244]
   [1.5925353  1.5517256 ]
   [2.834947   2.6255574 ]
   [2.238522   1.9052962 ]
   [1.2932158  1.0435019 ]]

  [[0.38155985 0.3777752 ]
   [0.8960915  0.830614  ]
   [1.54699    1.3679438 ]
   [1.1761999  0.9574436 ]
   [0.66193414 0.51196814]]]]
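
Here is a minimal sketch of that check (assuming TF 2.x; the shapes match the example above but the random inputs are illustrative, not the exact tensors printed here):

```python
import numpy as np
import tensorflow as tf

x = tf.random.normal([1, 3, 3, 4])   # NHWC input
w = tf.random.normal([3, 3, 2, 4])   # filter layout: [h, w, out_channels, in_channels]

same = tf.nn.conv2d_transpose(x, w, output_shape=[1, 3, 3, 2],
                              strides=[1, 1, 1, 1], padding='SAME')
valid = tf.nn.conv2d_transpose(x, w, output_shape=[1, 5, 5, 2],
                               strides=[1, 1, 1, 1], padding='VALID')

# De-pad: for kernel k and stride 1, SAME corresponds to cropping
# (k - 1) // 2 rows/cols at the top/left and the rest at the bottom/right.
cropped = valid[:, 1:4, 1:4, :]
print(np.allclose(same.numpy(), cropped.numpy(), atol=1e-5))  # expect: True
```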

It seems the current conv2d_transpose is not implemented this way. SAME padding is achieved by:

  1. dilating the input with zeros,
  2. padding the input according to the expected output shape,
  3. transposing and convolving (a rough sketch follows this list).

Please refer to the implementation. I guess it follows this scheme, but I am not quite sure, since I haven't been on this project long enough.
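
If I understand the description above correctly, the scheme looks roughly like this (my own sketch for a single 2-D channel with VALID output and TF's cross-correlation convention, not the actual TVM code):

```python
import numpy as np
from scipy.signal import correlate2d

def conv2d_transpose_2d(x, w, stride):
    k = w.shape[0]
    # 1. dilate the input: insert (stride - 1) zeros between elements
    d = np.zeros(((x.shape[0] - 1) * stride + 1,
                  (x.shape[1] - 1) * stride + 1), dtype=x.dtype)
    d[::stride, ::stride] = x
    # 2. pad by (kernel - 1) on every side
    p = np.pad(d, k - 1)
    # 3. ordinary convolution with the flipped kernel
    return correlate2d(p, w[::-1, ::-1], mode='valid')

x = np.random.rand(8, 8)
w = np.random.rand(3, 3)
print(conv2d_transpose_2d(x, w, stride=1).shape)  # (10, 10)
```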

If we use a 3x3 kernel and an 8x8 input, then the conv2d_transpose output size will be 10x10.

If you add more padding to the input, the output will be even bigger. Are you suggesting applying negative padding to the input, so that the preprocessed input becomes 6x6 and the resulting output is 8x8?
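
The size arithmetic behind both numbers, as I understand it (a hypothetical helper for illustration, not TVM code):

```python
def conv2d_transpose_out_size(in_size, kernel, stride, pad):
    # dilation inserts (stride - 1) zeros between elements; padding removes 2 * pad
    return (in_size - 1) * stride - 2 * pad + kernel

print(conv2d_transpose_out_size(8, 3, 1, 0))  # 10: the 8x8 -> 10x10 case
print(conv2d_transpose_out_size(6, 3, 1, 0))  # 8: input first trimmed to 6x6
```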

In any case, please add a _test_convolution case for a 3x3 kernel with SAME padding. It looks like you changed the kernel to 1x1 in order for the SAME padding tests to pass.

Yes. That's why using a kernel larger than 1x1 doesn't work in the current TVM implementation. Even with pad=0 and stride=1, the output size is still larger than the input size. There is no way to get the 'SAME' size under this condition. I am still in the process of understanding why TVM implemented this function this way, which is not compatible with TF.

Negative padding of the input could reverse the extra padding this introduces. Maybe it is also worth trying.

I would add this case if there were a TF model that needed a 'SAME' output size with a kernel larger than 1x1. This frontend function was added recently by me, and I only designed the test case based on my own considerations. I didn't change anything. Sorry if this caused any confusion.

For now I'd open a PR to add a kernel size check and show users a message saying "Currently we support SAME padding for kernel with size 1x1 only".
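
Something like this minimal guard, I suppose (the variable names and the exception class are my assumptions, not the actual PR code):

```python
import tvm

def _check_transpose_kernel(padding, kernel_h, kernel_w):
    # Hypothetical helper: reject the unsupported SAME case with a clear message.
    if padding == 'SAME' and (kernel_h, kernel_w) != (1, 1):
        raise tvm.error.OpAttributeUnImplemented(
            'Currently we support SAME padding for kernel with size 1x1 only')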

I raised PR #4484 to fix this.