Hi, according to the TVM 0.5 roadmap, the int8 quantizer seems to be ready already. Is that right?
If so, could you share a tutorial or a simple example, such as running prediction with an MXNet ImageNet pretrained model?
I think you can subscribe to these two links / RFCs:
1. TVM’s own quantization: https://github.com/dmlc/tvm/pull/2116
2. TVM import of existing quantized models: https://github.com/dmlc/tvm/issues/2351
I am working on the second one. My first priority is to support TFLite int8 models, because many users like TensorFlow’s quantization-aware training. I will send a PR for TFLite int8 support after the TFLite FP32 Relay frontend is merged.
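For readers new to the topic, here is a rough sketch of what an int8 model stores, assuming the affine scheme TFLite uses, where real_value = scale * (int8_value - zero_point). It is plain Python for illustration only: the function names and sample weights are hypothetical and are not part of the TVM or TFLite APIs.

```python
# Illustrative only: affine int8 quantization in plain Python.
# All names and sample values here are hypothetical, not TVM or TFLite APIs.

def choose_qparams(values, qmin=-128, qmax=127):
    """Pick a scale and zero point covering the observed range (always including 0.0)."""
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:
        # Degenerate case: every value is 0.0, so any scale works.
        return 1.0, 0
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to int8 codes: q = clamp(round(v / scale) + zero_point)."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(q_values, scale, zero_point):
    """Recover approximate floats: v = scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in q_values]


weights = [-1.0, -0.25, 0.0, 0.5, 1.5]
scale, zero_point = choose_qparams(weights)
q_weights = quantize(weights, scale, zero_point)
restored = dequantize(q_weights, scale, zero_point)
```

Each restored value differs from the original by at most about half of `scale`, and 0.0 is represented exactly. Importing an already quantized model means the frontend has to carry the per-tensor `scale` and `zero_point` along with the int8 data.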
I’d like to add a +1 for formal documentation and tutorials around TVM’s own quantization.