I have been testing quantization + AVX2 with the TensorFlow Inception model from the tutorial (https://docs.tvm.ai/tutorials/frontend/from_tensorflow.html) on an x86 machine.
I found that performance is actually worse with quantization: without quantization I get an inference time of 71.09 ms, but with quantization it increases to 166.18 ms.
Why is this the case? Does TVM properly support quantized inference with AVX2 on x86, or is some extra configuration needed to get vectorized int8 kernels?
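For reference, here is roughly the flow I used. This is a sketch: the `mod`/`params` names come from the tutorial's `relay.frontend.from_tensorflow` step, and the `global_scale` value and the exact `-mcpu` flag are my choices, not anything the tutorial prescribes.

```python
import tvm
from tvm import relay

# Assumption: `mod` and `params` were produced by
# relay.frontend.from_tensorflow(...) as in the linked tutorial.

# Target string requesting AVX2 codegen on x86 (illustrative -mcpu choice).
target = "llvm -mcpu=core-avx2"

# Apply Relay's automatic quantization with its global-scale calibration.
with relay.quantize.qconfig(calibrate_mode="global_scale", global_scale=8.0):
    mod = relay.quantize.quantize(mod, params)

# Build the quantized module; opt_level=3 enables the usual graph optimizations.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
```

I then benchmarked the resulting module the same way as the float32 build, so the only difference between the two numbers above should be the quantization pass.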