[RFC] Conv2D padding representation

xyzhou · January 15, 2020, 12:45am

After this PR merged, workload was changed as length of padding attribute in conv2d changes from two to four in asymetric padding workloads.

It turns out the Tophub schedules do not match the workloads with padding attribute length 4 (even for symetric padding), so we have to update Tophub to cover length-4-padding workloads. Intuitively, we could just add asymetric padding workloads to Tophub so that we have both length-2-padding and length-4-padding workloads. However, this solution has two issues:

Updating Tophub cannot completely resolve this problem. Since the model from other frontends (e.g., MXNet) still use length-2-padding, we may need to tune the same workload twice (length-4-padding and length-2-padding).
Symetric padding workloads from TF model still have to be re-tuned as padding is fused into conv2d. Previously in TF frontend, padding values is converted to be a padding operator prior to the conv2d operator so the padding of some conv2d ops from TF models are (0, 0). It was now fused into conv2d as other frontends.

In this RFC, we propose several solutions to fully resolve this issue:

Approach 1: Change all frontend to generate lengh-4-padding workloads.

Pros: easy to modify and straightforward.
Cons: there’s no strict rule to enforce this policy, so it might be violated again in the future.

Appraoch 2: Propose a new pass or intergete it into an existing pass to tranfrom the padding length of all cov2d workloads to 4.

Pros: This is a general solution for all frontends and AutoTVM.
Cons: 1) Add additional overhead and complexity to the pass. 2) It might be a little bit ad-hoc.

Approach 3: When AutoTVM is extracting tasks, it unifies the length of all conv2d padding to 4.

Pros: We only need to modify AutoTVM task extraction without touching any other parts.
Cons: This is even more ad-hoc.

Any comments and thoughts are welcome:)

cc @tqchen @yzhliu @merrymercy @kevinthesun @comaniac @haichen

FrozenGene · January 15, 2020, 2:20am

I prefer approach 1.

This PR’s goal is to support asymmetric padding, so in fact, we should modify frontend to leverage it. At far as I know, our Keras, Tensorflow, TFLite, CoreML frontend should modify.
Add new frontend is not frequently, meanwhile we have the mechanism of reviewing. When we review the code, we should notice it. According to my experience, when we have unified style, others will follow it. If we have 2+ style, they will choose one.

comaniac · January 15, 2020, 2:47am

Vote for approach 2.

I agree with @xyzhou that approach 1 has no enforcement and it could be an issue in long term even for existing frontends. As the number of reviewers will increase over time, it’s hard to expect all reviewers to notice this issue. Plus, this is not an issue for the build flow itself but only an issue for AutoTVM to match configs. It means violating this issue will not trigger any CI failures, and we need to repeat this flow to identify the reason when we notice performance regression.

Approach 3 is also considerable, but I am concerned about the inconsistency between model workload and extracted tasks.

FrozenGene · January 15, 2020, 3:39am

Hmm…in fact, like TF / TFLite / Keras / CoreML’s frontend asymmetric padding should be changed now (TF frontend is changed now), We could do attr['padding'] = (pad_t, pad_l, pad_d, pad_r), no nn.pad operator before convolution. This work only need to do one time in fact and only for convolution, we won’t touch it anymore in the future. So, I don’t think it is a problem for existing frontend. Again, when we have one unified style of padding, new frontend will follow this code style. If we have 2+ code styles, new contributors doesn’t know which one is better, they will choose one. We shouldn’t try to push everything into Relay pass / low level mechanism. We should do it in more suitable place.

comaniac · January 15, 2020, 4:28pm

That’s sort of a fair statement. Accordingly I would suggest adding one more modification to approach 1, which is enforcing (pad_t, pad_l, pad_d, pad_r) in TOPI conv2d argument. One reason for this issue is because TOPI conv2d padding argument is too flexible. If we change the conv2d spec to only allow 4-way padding (e.g., enforce Tuple[int, int, int, int] or len(padding) == 4), then we have to modify frontends to pass the CI.

FrozenGene · January 15, 2020, 4:42pm

Of course, we could do it and I agree this is better (One benefit of keeping 2D padding will let us keep old behavior and don’t need to change existing testing code). Current pooling op also do like this.(support 4D or 2D, maybe could be unified like convolution, if we decide to only support 4D)

comaniac · January 15, 2020, 4:47pm

Right. I think it is mainly what this RFC is trying to resolve. If supporting both 2-way and 4-way will not cause any problem then it’s definitely fine, but since the problem appears and already affects some users, it should be better to unify all attributes with the potential same issue.

FrozenGene · January 15, 2020, 4:52pm

I agree it. We could unify padding.

haichen · January 15, 2020, 9:27pm

An alternative approach is that we modify the relay.nn.conv2d python implementation and legalize the padding. So we don’t need to modify every frontend converter.

xyzhou · January 16, 2020, 9:50pm

It seems we have different opinions on both approach #1 and #2. I want to see if there are any more comments about this.

cc @tqchen @yzhliu @merrymercy @kevinthesun @wweic

merrymercy · January 20, 2020, 11:36pm

I prefer approach #1. Haichen’s suggestion is also okay.

Combining them, we can do two things:

Enforce topi.nn.conv2d to accept 4-way padding by inserting assertion, so users won’t break the rule. Since autotvm uses argument of topi function as workload for matching, the arguments format of topi function should always be a unique normal form.
Use Haichen’s suggestion to deal with frontend converters.

This bugs is very critical. It breaks out our benchmark. Please fix it as soon as possible.

comaniac · January 21, 2020, 2:22am

So you suggested the following flow:

Frontend converter generates conv2d (maybe 2-way or 4-way padding).
relay.nn.conv2d legalizes the padding to 4-way.
topi.nn.conv2d only accepts 4-way padding.

Due to the assertion in 3, AutoTVM conv2d workloads should always be 4-way padding. This solution seems promising to me.

kevinthesun · January 21, 2020, 7:57am

Agree in topi we should enforce 4d padding. In relay we can support 1d, 2d or 4d padding, which allows frontend converters to be unmodified.

xyzhou · January 21, 2020, 2:03pm

Thank you all for your comments and thoughts. The suggested flow sounds good to me. I will implement legalization relay.nn.conv2d padding to 4 way and make topi.nn.conv2d only accepts 4 way padding. I will submit a PR soon.