Issue description
Compiling a pretrained quantized GluonCV model at opt_level 3 fails with an error message about division by zero.
Steps to reproduce the issue
- Prepare hardware and environment that meet the requirements for TVM
- Install MXNet 1.5.1 or 1.6.0, GluonCV 0.7.0, and the latest MKL-DNN library
- Build TVM with `USE_MKLDNN ON`
- Download a pretrained INT8 model from GluonCV with `gluoncv.model_zoo.get_model()`
- Convert the model to a TVM Relay graph with `tvm.relay.frontend.from_mxnet()`
- Compile the graph with `tvm.relay.build()` at opt_level 3
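The steps above can be sketched as a single script (a minimal repro sketch, assuming TVM built with MKL-DNN, plus MXNet and GluonCV, are installed; the input shape and `llvm` target are illustrative, and imports are deferred so the sketch can be defined without those packages present):

```python
def compile_int8_model(model_name="resnet50_v1_int8", opt_level=3):
    """Reproduce the failure: convert a pretrained INT8 GluonCV model
    to a Relay graph and compile it at the given opt_level."""
    # Deferred imports: TVM/MXNet/GluonCV are only needed when this runs.
    import gluoncv
    import tvm
    from tvm import relay

    # Download the pretrained quantized model from the GluonCV model zoo.
    model = gluoncv.model_zoo.get_model(model_name, pretrained=True)

    # Convert to a Relay module; ImageNet-style input shape is illustrative.
    shape_dict = {"data": (1, 3, 224, 224)}
    mod, params = relay.frontend.from_mxnet(model, shape_dict)

    # Compiling at opt_level 3 is where the reported error surfaces:
    # TVMError: Check failed: fb->value != 0 (0 vs. 0) : Divide by zero
    with tvm.transform.PassContext(opt_level=opt_level):
        lib = relay.build(mod, target="llvm", params=params)
    return lib
```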
What’s the expected result?
- Compilation succeeds
What’s the actual result?
- Compilation fails with the following `TVMError`:

  File "/usr/tvm/src/tir/op/../../arith/const_fold.h", line 171
  TVMError: Check failed: fb->value != 0 (0 vs. 0) : Divide by zero
Additional details
- Confirmed for models `resnet50_v1_int8`, `mobilenet1.0_int8`, `ssd_300_vgg16_atrous_voc_int8`, and `ssd_512_vgg16_atrous_voc_int8`
- Confirmed at opt_levels 2, 3, and 4, with the `FastMath` pass disabled for opt_level 4 on GPU
- The offending graph node is a DivNode where the function `div(a, b)` receives `b = 0f`
- If the default value of `max_calib_range` in `_qnn_quantize()` in `relay/frontend/mxnet.py` is hard-coded to `20.0` (arbitrarily picked), then compilation succeeds and inference runs successfully; see "RuntimeWarning divide by zero encountered in true_divide when converting INT8 model to Relay graph"
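The referenced RuntimeWarning can be reproduced in isolation. This is a minimal sketch of the suspected mechanism, not TVM's actual code: it assumes an affine quantization scale derived from the calibration range, so a calibration range that collapses to zero yields a zero scale and a divide-by-zero downstream (the exact formula in `_qnn_quantize()` may differ):

```python
import numpy as np

def qnn_scale(min_calib_range, max_calib_range):
    # Assumed affine quantization scale: real-valued calibration range
    # mapped onto the INT8 range. If the range collapses to zero
    # (e.g. defaults of 0.0), the scale is 0.
    qmin, qmax = np.iinfo(np.int8).min, np.iinfo(np.int8).max
    return (max_calib_range - min_calib_range) / (qmax - qmin)

# With a degenerate calibration range of [0.0, 0.0] the scale is 0...
scale = qnn_scale(0.0, 0.0)
print(scale)  # 0.0

# ...and dividing by it reproduces the warning from the linked issue:
with np.errstate(divide="warn"):
    data = np.array([1.0, 2.0])
    q = np.true_divide(data, scale)  # RuntimeWarning: divide by zero
print(q)  # [inf inf]
```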
Suggested solutions
- Fix TVM's graph conversion of quantized models so that it does not introduce division by zero at compile time
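One possible shape of such a fix, purely illustrative (the function and parameter names here are hypothetical, not TVM's actual code): guard a degenerate calibration range before deriving the quantization scale, instead of hard-coding an arbitrary `max_calib_range`:

```python
def safe_scale(min_calib_range, max_calib_range, eps=1e-8):
    """Hypothetical guard: derive a quantization scale that can never be
    zero, so later div(a, b) nodes never receive b = 0."""
    span = max_calib_range - min_calib_range
    if abs(span) < eps:
        # Clamp a collapsed range to a tiny floor; a real fix might
        # instead recalibrate or raise a diagnostic here.
        span = eps
    return span / 255.0

print(safe_scale(0.0, 0.0))   # tiny but nonzero
print(safe_scale(0.0, 20.0))  # normal case unchanged
```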