Quantization accuracy drop with KL divergence

I’m running int8 quantization with KL-divergence calibration on ResNet-50, skipping the first conv layer, but the top-1 accuracy is only 46%. When I use max-value calibration, the top-1 accuracy reaches 74.5%. Why does KL divergence make the top-1 accuracy drop so much? @ZihengJiang
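
Roughly, this is how I invoke KL-divergence calibration through the Relay quantization API (a simplified sketch, not my exact script; the input name "data", the calibration batches, and the imported mod/params are placeholders):

```python
from tvm import relay

def calibrate_dataset():
    # Placeholder generator: yield dicts mapping the model's input name to
    # preprocessed calibration batches (e.g. a few hundred ImageNet images).
    for batch in calibration_batches:  # hypothetical iterable of numpy arrays
        yield {"data": batch}

# mod, params: the ResNet-50 Relay module and parameters imported beforehand.
with relay.quantize.qconfig(calibrate_mode="kl_divergence",
                            skip_conv_layers=[0]):
    qmod = relay.quantize.quantize(mod, params, dataset=calibrate_dataset())
```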

I found that the top-1 accuracy drop is mainly caused by operator fusion (_transform.SimplifyInference()). If I skip that pass and then use KL-divergence calibration, the top-1 accuracy is 73.5%. Does TVM have official test results for quantization?
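
For context, SimplifyInference is part of the prerequisite pipeline that relay.quantize runs before calibration. Here is a rough sketch of applying those passes standalone to inspect their effect on the graph; this is just an illustration under that assumption, not exactly how I patched the quantizer:

```python
import tvm
from tvm import relay

# mod: the imported Relay module (placeholder).
seq = tvm.transform.Sequential([
    relay.transform.SimplifyInference(),  # lower nn.batch_norm / nn.dropout to plain arithmetic
    relay.transform.FoldConstant(),
    relay.transform.FoldScaleAxis(),      # fold the BN scales into conv weights
])
with tvm.transform.PassContext(opt_level=3):
    simplified_mod = seq(mod)

print(simplified_mod)  # compare the graph with and without these passes
```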

Can you post your entire quantization config?

"nbit_input": 8, "nbit_weight": 8, "nbit_activation": 32, "dtype_input": "int8", "dtype_weight": "int8", "dtype_activation": "int32", "calibrate_mode": "kl_divergence", "global_scale": 8.0, "weight_scale": "max", "skip_conv_layers": [0], "round_for_shift": True, "debug_enabled_ops": None, "rounding": "UPWARD", "calibrate_chunk_by": -1
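
Expressed as a call, those settings look roughly like this (assuming mod/params and a calibrate_dataset generator as in the sketch above; as far as I understand, global_scale is ignored when calibrate_mode is "kl_divergence"):

```python
from tvm import relay

with relay.quantize.qconfig(nbit_input=8,
                            nbit_weight=8,
                            nbit_activation=32,
                            dtype_input="int8",
                            dtype_weight="int8",
                            dtype_activation="int32",
                            calibrate_mode="kl_divergence",
                            global_scale=8.0,
                            weight_scale="max",
                            skip_conv_layers=[0],
                            round_for_shift=True,
                            debug_enabled_ops=None,
                            rounding="UPWARD",
                            calibrate_chunk_by=-1):
    qmod = relay.quantize.quantize(mod, params, dataset=calibrate_dataset())
```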

I found that the accuracy with version 0.7 is close to zero when I use the default settings. What causes it? Am I using the wrong configuration? @adb, @vinx13


I also face the problem that the top-1 accuracy for ResNet-50 drops from 76.25% (FP32 baseline) to somewhere between 50% and 55% when I use quantization with kl_divergence calibration. In contrast, if I use the global_scale mode with global_scale=4 (see the sketch below), I obtain a top-1 accuracy of 75.61%.

Does anyone know what the reason could be?
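
For reference, the global_scale setup I compared against is essentially this (simplified; mod/params are the imported ResNet-50 module, and no calibration dataset is needed in this mode as far as I can tell):

```python
from tvm import relay

with relay.quantize.qconfig(calibrate_mode="global_scale",
                            global_scale=4.0):
    qmod = relay.quantize.quantize(mod, params)
```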

Are there any updates on this issue? I have checked again with Apache TVM (incubating) v0.7.0 but I still face the same problem.