Quantization broken due to PR #3135


There are some issues in quantization due to changes in #3135.

This assertion seems not necessary.

A use case is that, lhs is quantized (int8), and rhs is constant.
For example, in resent we have add(max_pool(…), bias). Since max_pool is quantized, lhs of add is int8. In this case, during realize stage we need to convert both lhs and rhs to int32 and then perform the addition.

After removing this assertion we can run the annotate pass successfully. But other changes are still needed in realize.

In UnifyDTypeScale:

Conditions here do not cover every case. In the example above, lhs is int8 and rhs is float.

This can be reproduced with:

import tvm
import tvm.relay as relay
import tvm.relay.testing

net, params = relay.testing.resnet.get_workload(1)
relay.quantize.quantize(net, params = params)

@ziheng @thierry


Looking forward for the updates on this!


Thank you for the catch @zhiics. I believe this stresses the importance of extended unit test coverage for quantization passes. Do you want to go ahead and issue a PR fix? If VTA quantization is broken, it will be caught when we build the sphinx gallery.


Okay I will send a patch


I have fixed the issue here in https://github.com/vinx13/tvm/commit/138b0d5f5d3054c33a356daa24bc16278adb890c

One more question:
Should I recover these lines?

The use case is add(conv, conv), where rhs is a shortcut. These lines transform rhs to cast(i32, cast(i8, qconv)). I found VTA is doing similar thing using ForceCast, maybe we can unify this?
@ziheng @thierry


Unifying makes sense; this would mean that we wouldn’t need to rely on ForceCast anymore for VTA quantization? It makes sense to keep annotations limited.


Yes we can have a unified way to do this.
In add_vta_rewrite realize is explicitly called which produces ForceCast + StopFusion. This is the same as what we did in UnifyDTypeScale that used simulated_quantize annotation.


Patch for this: https://github.com/dmlc/tvm/pull/3534