[Quantization][AutoTVM] Performance degradation of Quantization + AutoTVM vs FP on Raspberry Pi 3B

Hi,

I am trying to use quantization with the MobileNetV1 model on a Raspberry Pi 3B, but the quantized model is slower than the float version. It is also much slower than the quantized TFLite counterpart.

More information is given below.

Environment: Raspberry Pi 3B, aarch64, ARM Cortex-A53 at 1.2 GHz, single core

Performance:

  • AutoTVM float: 376 ms
  • AutoTVM quantization: 426 ms
  • TFLite float: 453 ms
  • TFLite quantization: 237 ms
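
To put the numbers above in perspective, here is a quick calculation of the relative slowdowns. It is plain Python and does not need TVM; the dictionary keys and the `ratio` helper are just names chosen for this illustration, and the latencies are copied from the list above.

```python
# Latencies in milliseconds, copied from the measurements listed above.
latency_ms = {
    "AutoTVM float": 376,
    "AutoTVM quantization": 426,
    "TFLite float": 453,
    "TFLite quantization": 237,
}


def ratio(slower, faster):
    """Return latency of `slower` divided by latency of `faster`."""
    return latency_ms[slower] / latency_ms[faster]


# How much slower the quantized AutoTVM model is than the float AutoTVM model.
tvm_quant_vs_float = ratio("AutoTVM quantization", "AutoTVM float")

# How much slower the quantized AutoTVM model is than the quantized TFLite model.
tvm_quant_vs_tflite_quant = ratio("AutoTVM quantization", "TFLite quantization")

# For comparison: the speedup TFLite gets from quantization.
tflite_float_vs_quant = ratio("TFLite float", "TFLite quantization")

print(f"AutoTVM quantized vs AutoTVM float:    {tvm_quant_vs_float:.2f}x")
print(f"AutoTVM quantized vs TFLite quantized: {tvm_quant_vs_tflite_quant:.2f}x")
print(f"TFLite float vs TFLite quantized:      {tflite_float_vs_quant:.2f}x")
```

So quantization makes the AutoTVM model roughly 13% slower, while it makes the TFLite model nearly twice as fast, leaving the quantized AutoTVM model about 1.8x slower than the quantized TFLite one.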

Any suggestions for improving the performance?