[Quantization][AutoTVM] Performance degradation of Quantization + AutoTVM vs FP on Raspberry Pi 3B

zhangliliang · February 24, 2020, 4:57am

Hi,

I am trying to use quantization on a MobileNetV1 model on the Raspberry Pi 3B, but found that it is slower than the float version. It is also much slower than the quantized TFLite counterpart.

More information is given below.

Environment: Raspberry Pi 3B, aarch64, ARM Cortex-A53 1.2 GHz, single core

Performance:

AutoTVM float: 376 ms
AutoTVM quantization: 426 ms
TFLite float: 453 ms
TFLite quantization: 237 ms
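
To make the gaps easier to compare, here is a small sketch that only does the arithmetic on the four numbers above. The dictionary keys and the `ratio` helper are names made up for this illustration; they are not part of TVM or TFLite.

```python
# Latencies in milliseconds, copied from the measurements listed above.
# The key names are illustrative labels, not identifiers from any library.
latency_ms = {
    "autotvm_float": 376,
    "autotvm_quant": 426,
    "tflite_float": 453,
    "tflite_quant": 237,
}


def ratio(a, b):
    """Return how many times longer configuration `a` takes than `b`."""
    return latency_ms[a] / latency_ms[b]


# Quantized AutoTVM relative to float AutoTVM (a value above 1 means slower).
tvm_quant_vs_float = ratio("autotvm_quant", "autotvm_float")

# Quantized AutoTVM relative to quantized TFLite.
tvm_vs_tflite_quant = ratio("autotvm_quant", "tflite_quant")

# Float TFLite relative to quantized TFLite, i.e. the speedup TFLite
# obtains from quantization on the same device.
tflite_float_vs_quant = ratio("tflite_float", "tflite_quant")

print(f"AutoTVM quant vs AutoTVM float: {tvm_quant_vs_float:.2f}x")
print(f"AutoTVM quant vs TFLite quant:  {tvm_vs_tflite_quant:.2f}x")
print(f"TFLite float vs TFLite quant:   {tflite_float_vs_quant:.2f}x")
```

In other words, quantization makes the AutoTVM model about 13% slower, while TFLite gets roughly a 1.9x speedup from it, which leaves the quantized AutoTVM model about 1.8x slower than quantized TFLite.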

Any suggestions for improving the performance?