Quantization fails with recent master commits of TVM

Issue description

Quantization of some pretrained GluonCV models fails with TVMError “Check failed” in TVM built from master commit aa808570 or later.

Steps to reproduce the issue

  1. Prepare hardware and environment that meet the requirements for TVM
  2. Install MXNet 1.5.1 or 1.6.0, GluonCV 0.7.0, and the latest MKL-DNN library
  3. Build TVM master commit aa808570 or later with USE_MKLDNN ON
  4. Download pretrained model ssd_512_vgg16_atrous_voc from GluonCV with gluoncv.model_zoo.get_model()
  5. Convert the model to a TVM Relay graph with tvm.relay.frontend.from_mxnet()
  6. Quantize the model with tvm.relay.quantize.quantize()
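The steps above can be sketched as a minimal repro script. This is a sketch only: the input shape and the default quantization config are assumptions on my part, not taken from the original report.

```python
# Minimal reproduction sketch for the reported failure.
# Assumes MXNet, GluonCV 0.7.0, and an affected TVM build are installed.
import tvm
from tvm import relay
from gluoncv import model_zoo

# Step 4: download the pretrained model.
model = model_zoo.get_model("ssd_512_vgg16_atrous_voc", pretrained=True)

# Step 5: convert to a TVM Relay graph.
# SSD-512 expects a 512x512 input; the batch size of 1 is an assumption.
shape_dict = {"data": (1, 3, 512, 512)}
mod, params = relay.frontend.from_mxnet(model, shape_dict)

# Step 6: quantize. On affected commits this raises:
#   TVMError: Check failed: new_args.size() == 1 (4 vs. 1)
with relay.quantize.qconfig():
    qmod = relay.quantize.quantize(mod, params=params)
```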

What’s the expected result?

  • Quantization succeeds

What’s the actual result?

  • Quantization fails with the following TVMError:

    File "/usr/tvm/src/relay/quantize/realize.cc", line 408
    TVMError: Check failed: new_args.size() == 1 (4 vs. 1) :
    

Additional details

  • The error occurs in the “realize” part of quantization, in the function IdentityRealize
  • The error occurs because the argument new_args to IdentityRealize has length 4 instead of the expected 1, causing the corresponding check to fail
  • Confirmed for models ssd_512_vgg16_atrous_voc and yolo3_darknet53_voc
  • Confirmed for TVM commits aa808570, 151f3f5a, and 9f7745e7
  • Quantization succeeds for the model resnet50_v1
  • Quantization succeeds for the TVM master commits 43dcbc6b and 0ea99698, which are earlier than aa808570, and for the TVM tag v0.6.1.rc1

Suggested solutions

  • Restore quantization support in TVM for the GluonCV pretrained models that currently fail

I’m experiencing the same error when running Tiny BERT (from the Google repo: https://github.com/google-research/bert) through this quantization step. Any help or insight would be appreciated.

Hi @cecilia and @jenst.

I’m seeing the same problem when quantizing a BERT model. Have either of you found a solution?

Many thanks in advance.

Hi @G4V, for me this was caused by the strided_slice operation in the BERT model, which takes 4 arguments instead of 1 and hence fails the check in quantize/realize.cc. To get around this, I just commented out the check for now and rebuilt, which works fine for me.

I think alternatively you could write a custom StridedRealize function that’s basically identical to IdentityRealize but checks for four arguments instead, and then register the strided_slice operation with StridedRealize.

Hope that helps!

Brilliant! Thanks @jenst. Will give the workaround a try. Sounds like the path of least resistance. 🙂
