Quantized inference on x86 with AVX2 is slower than unquantized inference



Hi,

I have been testing quantization + AVX2 with the TensorFlow Inception model from the tutorial (https://docs.tvm.ai/tutorials/frontend/from_tensorflow.html) on an x86 machine.

I found that performance is worse with quantization. Without quantization I get an inference time of 71.09 ms, but with quantization the inference time increases to 166.18 ms.
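
To put a number on the gap, here is a minimal sketch of how the two timings can be compared. The helper names (`mean_latency_ms`, `slowdown`) are hypothetical and not part of TVM; `run` stands for any callable that performs one inference, and the warmup and repeat counts are assumptions rather than the settings used for the numbers above.

```python
import time
from statistics import mean


def mean_latency_ms(run, repeats=10, warmup=2):
    """Return the mean wall-clock time of run() in milliseconds.

    `run` is a hypothetical zero-argument callable that executes one
    inference; a few warmup calls are made first and not timed.
    """
    for _ in range(warmup):
        run()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run()
        samples.append((time.perf_counter() - start) * 1000.0)
    return mean(samples)


def slowdown(baseline_ms, candidate_ms):
    """Ratio of candidate to baseline; a value above 1 means slower."""
    return candidate_ms / baseline_ms


# The two inference times reported in this post.
baseline_ms = 71.09    # without quantization
quantized_ms = 166.18  # with quantization
print(f"quantized / baseline = {slowdown(baseline_ms, quantized_ms):.2f}x")
```

With the reported numbers the quantized model comes out at roughly 2.3 times the latency of the unquantized one.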

Why is this the case? Does TVM properly support quantization and AVX2 on x86?

Thanks