Auto-tuner for quantized model

HuiFeng · April 17, 2019, 10:19am

Does AutoTVM support quantized models on x86? I want to auto-tune a quantized model as in the tutorials, with code like this:

from tvm import autotvm, relay

# quantize
net_quant = relay.quantize.quantize(net, params)

# launch task
tasks = autotvm.task.extract_from_program(net_quant, target=target, params=params, ops=(relay.op.nn.conv2d,))

# load log file
with autotvm.apply_history_best(log_file):
    with relay.build_config(opt_level=3):
        graph_quant, lib_quant, params_quant = relay.build_module.build(net_quant, target=target, params=params)

error:

KeyError: 'tile_c', Error during compile func

Please help me.

eqy · April 17, 2019, 7:07pm

Currently this is a bit tricky because the quantize pass does not work with alter_op_layout enabled in opt_level=3.

What are the extracted tasks, and what does it try to tune?

HuiFeng · April 18, 2019, 6:53am

I am attempting to tune these ops (conv2d, dense, conv2d_transpose). When I load the log_file, the same error as above occurs no matter which opt_level I set: "KeyError: 'tile_c', Error during compile func". Why?

eqy · April 18, 2019, 6:57am

This is likely a template mismatch: alter_op_layout transforms operators from conv2d to conv2d NCHWc, so the tuned configs no longer match the template. Can you print the graph just before tuning and just before compiling to check this?
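The kind of mismatch described here can be illustrated with a toy sketch in plain Python. It does not use TVM at all: the dict stands in for a tuned config, the two functions stand in for schedule templates, and every name except the knob tile_c from the error message (schedule_template_a, schedule_template_b, tile_a, try_schedule) is hypothetical. It only shows how a config recorded for one template raises KeyError when a different template reads it.

```python
# Toy illustration only: a plain dict stands in for a tuned config and the
# two functions stand in for schedule templates. None of this is TVM code.

def schedule_template_a(cfg):
    # Hypothetical template that was active when the log was recorded.
    return ("template_a", cfg["tile_a"])


def schedule_template_b(cfg):
    # Hypothetical template selected at build time after a layout rewrite.
    # It expects a knob named tile_c, which template A never defined.
    return ("template_b", cfg["tile_c"])


# A config as it might be stored for template A (the values are made up).
tuned_cfg = {"tile_a": [4, 8]}


def try_schedule(template, cfg):
    """Return the template's result, or a message naming the missing knob."""
    try:
        return template(cfg)
    except KeyError as err:
        return "missing knob: " + str(err)


print(try_schedule(schedule_template_a, tuned_cfg))
print(try_schedule(schedule_template_b, tuned_cfg))
```

If the real failure has the same shape, comparing the operators in the graph at tuning time with those at build time should show which operator changed template.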

HuiFeng · April 18, 2019, 8:09am

There is no error when I use NNVM to tune and build with the autotune log_file, but Relay fails. Is something wrong with Relay? I also checked the graph before and after tuning. I need Relay because I want to autotune a quantized model, and NNVM does not seem to support quantized models.

eqy · April 18, 2019, 8:08am

It is possible. Can you print the Relay graph before and after tuning?

HuiFeng · April 18, 2019, 8:17am

The net loaded from relay.frontend.from_onnx:

The net after tuning with autotvm.task.extract_from_program and before building with autotvm.apply_history_best:

Sorry, the graph is too big; I can only show part of it.

HuiFeng · April 18, 2019, 8:23am

There is no problem if I skip autotuning and autotvm.apply_history_best and just use relay.build_config(opt_level=3).

henry099 · April 13, 2020, 3:04am

The params passed to extract_from_program are still fp32 weights. We can use them to create quantized tasks.

kindlehe · April 17, 2020, 7:07am

@HuiFeng

I also fail at relay.build when I use autotuning and autotvm.apply_history_best.

Have you solved it?

henry099 · April 21, 2020, 9:52am

It seems to work well, but I am still confused about why extract_from_program can take fp32 weights (params=params). I think it should take int8 params.