Quantization accuracy drop with KL divergence

I’m running int8 quantization with KL-divergence calibration on ResNet-50, skipping the first conv layer, but the top-1 accuracy is only 46%. When I use max-value calibration, the top-1 accuracy reaches 74.5%. Why does KL divergence make the top-1 accuracy drop so much? @ZihengJiang
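
Roughly, this is how I invoke KL-divergence calibration through the Relay quantization API (a simplified sketch, not my exact script; the input name "data", the calibration batches, and the imported mod/params are placeholders):

```python
from tvm import relay

def calibrate_dataset():
    # Placeholder generator: yield dicts mapping the model's input name to
    # preprocessed calibration batches (e.g. a few hundred ImageNet images).
    for batch in calibration_batches:  # hypothetical iterable of numpy arrays
        yield {"data": batch}

# mod, params: the ResNet-50 Relay module and parameters imported beforehand.
with relay.quantize.qconfig(calibrate_mode="kl_divergence",
                            skip_conv_layers=[0]):
    qmod = relay.quantize.quantize(mod, params, dataset=calibrate_dataset())
```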

I found that the top-1 accuracy drop is mainly caused by operator fusion (_transform.SimplifyInference()). If I skip that pass and then use KL-divergence calibration, the top-1 accuracy is 73.5%. Does TVM have official test results for quantization?
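
For context, SimplifyInference is part of the prerequisite pipeline that relay.quantize runs before calibration. Here is a rough sketch of applying those passes standalone to inspect their effect on the graph; this is just an illustration under that assumption, not exactly how I patched the quantizer:

```python
import tvm
from tvm import relay

# mod: the imported Relay module (placeholder).
seq = tvm.transform.Sequential([
    relay.transform.SimplifyInference(),  # lower nn.batch_norm / nn.dropout to plain arithmetic
    relay.transform.FoldConstant(),
    relay.transform.FoldScaleAxis(),      # fold the BN scales into conv weights
])
with tvm.transform.PassContext(opt_level=3):
    simplified_mod = seq(mod)

print(simplified_mod)  # compare the graph with and without these passes
```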

Can you post your entire quantization config?

"nbit_input": 8, "nbit_weight": 8, "nbit_activation": 32, "dtype_input": "int8", "dtype_weight": "int8", "dtype_activation": "int32", "calibrate_mode": "kl_divergence", "global_scale": 8.0, "weight_scale": "max", "skip_conv_layers": [0], "round_for_shift": True, "debug_enabled_ops": None, "rounding": "UPWARD", "calibrate_chunk_by": -1
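
Expressed as a call, those settings look roughly like this (assuming mod/params and a calibrate_dataset generator as in the sketch above; as far as I understand, global_scale is ignored when calibrate_mode is "kl_divergence"):

```python
from tvm import relay

with relay.quantize.qconfig(nbit_input=8,
                            nbit_weight=8,
                            nbit_activation=32,
                            dtype_input="int8",
                            dtype_weight="int8",
                            dtype_activation="int32",
                            calibrate_mode="kl_divergence",
                            global_scale=8.0,
                            weight_scale="max",
                            skip_conv_layers=[0],
                            round_for_shift=True,
                            debug_enabled_ops=None,
                            rounding="UPWARD",
                            calibrate_chunk_by=-1):
    qmod = relay.quantize.quantize(mod, params, dataset=calibrate_dataset())
```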

I found that the accuracy with version 0.7 is close to zero when I use the default settings. What causes it? Am I using the wrong configuration? @adb, @vinx13


I also face the problem that the top-1 accuracy for ResNet-50 drops from 76.25% (FP32 baseline) to somewhere between 50% and 55% when I use quantization with kl_divergence calibration. In contrast, if I use the global_scale mode with global_scale=4 (see the sketch below), I obtain a top-1 accuracy of 75.61%.

Does anyone know what the reason could be?
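
For reference, the global_scale setup I compared against is essentially this (simplified; mod/params are the imported ResNet-50 module, and no calibration dataset is needed in this mode as far as I can tell):

```python
from tvm import relay

with relay.quantize.qconfig(calibrate_mode="global_scale",
                            global_scale=4.0):
    qmod = relay.quantize.quantize(mod, params)
```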

Are there any updates on this issue? I have checked again with Apache TVM (incubating) v0.7.0 but I still face the same problem.