Tuning for int8 quantized config log is too slow


#1

Use tf.compat.v1.graph_util.extract_sub_graph
…100%, 0.02 MB, 1 KB/s, 16 seconds passed
Tuning…
[Task 1/23] Current/Best: 72.31/ 823.37 GFLOPS | Progress: (2000/2000) | 40238.38 s Done.
[Task 2/23] Current/Best: 5.33/1250.61 GFLOPS | Progress: (448/2000) | 13866.37 s

I want to tune a ResNet-50 model to produce an int8 quantized config log, but the tuning process is extremely slow (see the log above).
Is this normal?
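For context, here is a minimal sketch of the kind of setup implied above: quantizing a Relay module to int8 and extracting the conv2d tuning tasks that appear as `[Task x/23]` in the log. The `mod`/`params` names are assumptions (e.g. from a frontend import), and exact signatures vary somewhat across TVM versions:

```python
# Minimal sketch, assuming `mod` and `params` come from a frontend import,
# e.g. relay.frontend.from_tensorflow(...). Signatures vary by TVM version.
from tvm import relay, autotvm

# Quantize the model to int8 before extracting tuning tasks.
with relay.quantize.qconfig():
    mod = relay.quantize.quantize(mod, params=params)

# Extract the conv2d tasks that show up as [Task x/23] in the log.
tasks = autotvm.task.extract_from_program(
    mod["main"], target="cuda", params=params,
    ops=(relay.op.get("nn.conv2d"),),
)
print("extracted %d tasks" % len(tasks))
```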


#2

Your trial budget is a little too large; I've found that you can get excellent results after just 100 or 200 trials per task instead of your 2000.
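For example, a hedged sketch of the tuning loop with a reduced trial budget; the log file name and `early_stopping` value are arbitrary choices, not from this thread:

```python
from tvm import autotvm
from tvm.autotvm.tuner import XGBTuner

# `tasks` as extracted above; cap trials at ~200 instead of 2000.
for i, task in enumerate(tasks):
    tuner = XGBTuner(task, loss_type="rank")
    tuner.tune(
        n_trial=min(200, len(task.config_space)),
        early_stopping=80,  # arbitrary: stop a task early if no improvement
        measure_option=autotvm.measure_option(
            builder=autotvm.LocalBuilder(),
            runner=autotvm.LocalRunner(number=10),
        ),
        callbacks=[autotvm.callback.log_to_file("resnet50_int8.log")],
    )
```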


#3

Thanks, I will try it.


#4

How do I use int8 quantization? Where is the tutorial? Thanks.


#5

Could you give me sample code for tuning an int8 quantized config?
I have an int8 quantized config, but performance has not improved; it is the same as float32.
Also, the config log is missing the first conv2d task ('conv2d', (1, 3, 230, 230, 'int8')); after tuning I only got ('conv2d', (1, 3, 230, 230, 'float32')).
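One likely reason the first conv2d stays float32: TVM's quantization pass skips the first conv layer by default to preserve accuracy, so it is never lowered to an int8 task. A hedged sketch of overriding that (the parameter is `skip_conv_layers` in recent TVM versions; older releases used a `skip_k_conv` integer instead, so check your version):

```python
from tvm import relay

# Assumed: the default qconfig uses skip_conv_layers=[0], leaving the first
# conv2d in float32. An empty list quantizes every conv layer, which may
# cost some accuracy.
with relay.quantize.qconfig(skip_conv_layers=[]):
    mod = relay.quantize.quantize(mod, params=params)
```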


#6

Please refer to https://github.com/vinx13/tvm-cuda-int8-benchmark


#7

Does TVM's int8 quantization support ARM?