Per-axis quantization support for TFLite

Hello, I am working at ST Microelectronics and am evaluating TVM in our environment. I have noticed that in TVM 0.6, per-axis quantized .tflite models, such as those generated by TensorFlow Lite, are not supported. Is anybody working on adding support for such models? How can we contribute?

cheers,
Arthur

Hello there,

Welcome to the community! AFAIK, there is nothing in place for signed int8 symmetric quantization support in the tflite frontend yet, even in master. However, I believe the underlying code generation framework can support it via the QNN dialect of Relay, based on this RFC: [QNN] Channel wise quantization - Quantize and Requantize
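To make the terminology concrete, here is a rough sketch of what per-channel (per-axis) symmetric int8 quantization means for conv weights: one scale per output channel, with the zero point fixed at 0. This is a pure-Python illustration of the arithmetic, not the TVM/QNN API; function names are made up for the example.

```python
def quantize_per_channel(weights):
    """weights: a list of channels, each a list of floats.

    Returns (int8 values per channel, one float scale per channel).
    """
    q, scales = [], []
    for channel in weights:
        # Symmetric: the scale maps the largest magnitude to the int8 limit 127,
        # so the zero point is always 0 (no offset term).
        scale = max(abs(v) for v in channel) / 127.0 or 1.0
        q.append([max(-128, min(127, round(v / scale))) for v in channel])
        scales.append(scale)
    return q, scales


def dequantize_per_channel(q, scales):
    # Each channel is rescaled by its own scale on the way back to float.
    return [[v * s for v in ch] for ch, s in zip(q, scales)]
```

The point of per-axis over per-tensor is visible in the second channel of `quantize_per_channel([[0.5, -1.0, 0.25], [10.0, -20.0, 5.0]])`: a single whole-tensor scale chosen for the large channel would crush the small one to a handful of levels, while per-channel scales keep full int8 resolution in both.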

It’s certainly of interest to me and my team but we haven’t done much with it beyond some investigations this week. I’m not aware of any work in this space in the community at the moment, maybe this will interest some folks.

@janimesh - any thoughts ?

Ramana

Per-channel weight quantization is fully supported. I don’t know much about the TFLite frontend, but our PyTorch frontend fully supports per-channel quantization.

This tutorial demonstrates importing a per-channel quantized PyTorch model:

https://docs.tvm.ai/tutorials/frontend/deploy_prequantized.html#sphx-glr-tutorials-frontend-deploy-prequantized-py

Thanks, that sounds like it should be relatively straightforward to integrate.

Ramana

Thank you Ramana and Masahi. Basically, there should be enough support already to use per-axis quantized TFLite models, which is good enough for us at this point. My TVM master checkout from around March must not have included the per-axis RFC work yet. Thanks a lot for your replies.

cheers,
Arthur