Quantization fails with recent master commits of TVM

Issue description

Quantization of some pretrained GluonCV models fails with TVMError “Check failed” in TVM built from master commit aa808570 or later.

Steps to reproduce the issue

  1. Prepare hardware and environment that meet the requirements for TVM
  2. Install MXNet 1.5.1 or 1.6.0, GluonCV 0.7.0, and the latest MKL-DNN library
  3. Build TVM master commit aa808570 or later with USE_MKLDNN ON
  4. Download pretrained model ssd_512_vgg16_atrous_voc from GluonCV with gluoncv.model_zoo.get_model()
  5. Convert the model to a TVM Relay graph with tvm.relay.frontend.from_mxnet()
  6. Quantize the model with tvm.relay.quantize.quantize()
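The steps above can be sketched as a minimal repro script. This is a sketch only: the input shape and the default quantization config are assumptions on my part, not taken from the original report.

```python
# Minimal reproduction sketch for the reported failure.
# Assumes MXNet, GluonCV 0.7.0, and an affected TVM build are installed.
import tvm
from tvm import relay
from gluoncv import model_zoo

# Step 4: download the pretrained model.
model = model_zoo.get_model("ssd_512_vgg16_atrous_voc", pretrained=True)

# Step 5: convert to a TVM Relay graph.
# SSD-512 expects a 512x512 input; the batch size of 1 is an assumption.
shape_dict = {"data": (1, 3, 512, 512)}
mod, params = relay.frontend.from_mxnet(model, shape_dict)

# Step 6: quantize. On affected commits this raises:
#   TVMError: Check failed: new_args.size() == 1 (4 vs. 1)
with relay.quantize.qconfig():
    qmod = relay.quantize.quantize(mod, params=params)
```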

What’s the expected result?

  • Quantization succeeds

What’s the actual result?

  • Quantization fails with the following TVMError:

    File "/usr/tvm/src/relay/quantize/realize.cc", line 408
    TVMError: Check failed: new_args.size() == 1 (4 vs. 1) :
    

Additional details

  • The error occurs in the “realize” part of quantization, in the function IdentityRealize
  • The error occurs because the argument new_args to IdentityRealize has length 4 instead of the expected 1, causing the corresponding check to fail
  • Confirmed for models ssd_512_vgg16_atrous_voc and yolo3_darknet53_voc
  • Confirmed for TVM commits aa808570, 151f3f5a, and 9f7745e7
  • Quantization succeeds for the model resnet50_v1
  • Quantization succeeds for the TVM master commits 43dcbc6b and 0ea99698, which are earlier than aa808570, and for the TVM tag v0.6.1.rc1

Suggested solutions

  • Restore quantization support in TVM for the GluonCV pretrained models that currently fail

I’m experiencing the same error when running Tiny BERT (from the Google repo: https://github.com/google-research/bert) through this quantization step. Any help or insight would be appreciated.

Hi @cecilia and @jenst.

I’m seeing the same problem when quantizing a BERT model. Have either of you found a solution?

Many thanks in advance.

Hi @G4V, for me this was caused by the strided_slice operation in the BERT model, which takes 4 arguments instead of 1 and hence fails the check in quantize/realize.cc. To get around this, I just commented out the check for now and rebuilt, which works fine for me.

I think alternatively you could write a custom StridedRealize function that’s basically identical to IdentityRealize but checks for four arguments instead, and then register the strided_slice operation with StridedRealize.

Hope that helps!

Brilliant! Thanks @jenst. Will give the workaround a try. Sounds like the path of least resistance. 🙂
