Tuning for int8 quantized config log is too slow

Log output:

Use tf.compat.v1.graph_util.extract_sub_graph
…100%, 0.02 MB, 1 KB/s, 16 seconds passed
Tuning…
[Task 1/23] Current/Best: 72.31/ 823.37 GFLOPS | Progress: (2000/2000) | 40238.38 s Done.
[Task 2/23] Current/Best: 5.33/1250.61 GFLOPS | Progress: (448/2000) | 13866.37 s

I want to tune the ResNet50 model to produce an int8 quantized config log, but the tuning process is too slow. Is this normal?

Your search space is a little too large; I've found that you can get excellent results after just 100 or 200 trials, compared to your 2000.
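
A minimal sketch of what capping the trial count looks like, assuming the tasks were already extracted with autotvm.task.extract_from_program; the log filename is hypothetical:

```python
# Sketch only: assumes `tasks` was extracted via
# autotvm.task.extract_from_program; "resnet50-int8.log" is a made-up name.
from tvm import autotvm

measure_option = autotvm.measure_option(
    builder=autotvm.LocalBuilder(timeout=10),
    runner=autotvm.LocalRunner(number=10, repeat=1, timeout=4),
)

for i, task in enumerate(tasks):
    tuner = autotvm.tuner.XGBTuner(task, loss_type="rank")
    # Cap trials at 200 (or the full space, if it is smaller) instead of 2000.
    n_trial = min(200, len(task.config_space))
    tuner.tune(
        n_trial=n_trial,
        measure_option=measure_option,
        callbacks=[
            autotvm.callback.progress_bar(n_trial, prefix="[Task %2d]" % (i + 1)),
            autotvm.callback.log_to_file("resnet50-int8.log"),
        ],
    )
```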

Thanks, I will try it.

How do I use int8 quantization? Where is the tutorial? Thanks.

Could you give me sample code for tuning with an int8 quantized config?
I got an int8 quantized config, but performance has not improved; it is the same as float32.
Also, the config log is missing the first conv2d workload ('conv2d', (1, 3, 230, 230, 'int8')); after tuning I only got ('conv2d', (1, 3, 230, 230, 'float32')).

Please refer to https://github.com/vinx13/tvm-cuda-int8-benchmark
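
As a hedged sketch of the Relay quantization flow (the names `mod` and `params` and the global scale value are assumptions, not from the thread): note that Relay's quantizer skips the first conv layer by default via `skip_conv_layers=[0]`, which is one likely reason the tuning log only contains a float32 workload for the first conv2d.

```python
# Sketch only: assumes `mod` and `params` come from a Relay frontend import,
# e.g. relay.frontend.from_tensorflow.
from tvm import relay

with relay.quantize.qconfig(
    global_scale=8.0,       # calibration scale; this value is an assumption
    skip_conv_layers=[0],   # default: the first conv layer stays float32,
                            # so its int8 workload never appears in the log
):
    mod = relay.quantize.quantize(mod, params)

# Tune the quantized module as usual; int8 conv2d workloads (except the
# skipped first layer) should then appear among the extracted tasks.
```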

Does TVM's int8 quantization support ARM?