Auto-tuner for quantized model

Does AutoTVM support quantized models on x86? I want to auto-tune a quantized model as in the tutorials, with code like this:

quantize

net_quant = relay.quantize.quantize(net, params)

launch task

tasks = autotvm.task.extract_from_program(net_quant, target=target, params=params, ops=(relay.op.nn.conv2d,))

load log file

with autotvm.apply_history_best(log_file):
    with relay.build_config(opt_level=3):
        graph_quant, lib_quant, params_quant = relay.build_module.build(net_quant, target=target, params=params)

error:

KeyError: 'tile_c', Error during compile func

Please help me.

Currently this is a bit tricky because the quantize pass does not work with alter_op_layout enabled in opt_level=3.

What are the extracted tasks, and what does it try to tune?

I am trying to tune the ops (conv2d, dense, conv2d_transpose). I find that when I load the log file, no matter what opt_level I set, the same error as above occurs: "KeyError: 'tile_c', Error during compile func". Why?

This is likely because of a template mismatch where the alter_op_layout transforms operators from conv2d to conv2d nchwc and the configs do not match the template. Can you print the graph just before tuning and just before compile to check this?
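A toy illustration of that kind of mismatch, in plain Python rather than TVM. Both template functions and all knob values are made up for the example, and it does not claim which real template reads tile_c; it only shows how a config recorded for one template raises a KeyError when a different template looks up its own knobs.

```python
# Toy model of a schedule template reading a tuned config.
# All names and values here are hypothetical, not TVM APIs.

def direct_template(cfg):
    # This template expects the knob "tile_c" to be present.
    return "direct conv2d, tile_c=%d" % cfg["tile_c"]


def blocked_template(cfg):
    # This template declares different knobs for a blocked layout.
    return "blocked conv2d, tile_ic=%d, tile_oc=%d" % (cfg["tile_ic"], cfg["tile_oc"])


# Config as it might have been recorded while tuning the blocked template.
tuned_cfg = {"tile_ic": 8, "tile_oc": 16}


def compile_with(template, cfg):
    """Apply a template to a config and report a missing knob instead of crashing."""
    try:
        return template(cfg)
    except KeyError as err:
        return "KeyError: %s" % err


if __name__ == "__main__":
    print(compile_with(blocked_template, tuned_cfg))  # knobs match
    print(compile_with(direct_template, tuned_cfg))   # knobs do not match
```

If this is what is happening, the operator names seen at tuning time and at build time should differ, which is why comparing the two graphs is a useful check.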

There is no error when I use NNVM to tune and build with the autotune log file, but Relay fails. Is something wrong with Relay? I also checked the graph before and after tuning. I am asking because I want to auto-tune a quantized model, and NNVM does not seem to support quantized models.

It is possible, can you print the relay graph before and after tuning?

The net loaded from relay.frontend.from_onnx:

The net after tuning the tasks from autotvm.task.extract_from_program and before building under autotvm.apply_history_best:

Sorry, the graph is too big; I can only show part of it.
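When the full dump is too large to post, one option is to compare operator counts instead. Below is a minimal sketch, assuming both graphs have been dumped to text and that operator calls appear as `op_name(...)`. The helper names `op_histogram` and `diff_ops` are made up for this example, and the two sample strings are hand-written stand-ins, not real output.

```python
import re
from collections import Counter

# Matches an identifier (optionally dotted) that is directly followed by "(".
# The lookbehind skips %variables and avoids matching in the middle of a name.
_CALL = re.compile(r"(?<![%\w.])([A-Za-z_]\w*(?:\.\w+)*)\(")


def op_histogram(graph_text):
    """Count operator calls in a textual graph dump."""
    return Counter(_CALL.findall(graph_text))


def diff_ops(before_text, after_text):
    """Return {op: (count_before, count_after)} for ops whose counts differ."""
    before = op_histogram(before_text)
    after = op_histogram(after_text)
    return {
        op: (before[op], after[op])
        for op in sorted(set(before) | set(after))
        if before[op] != after[op]
    }


# Hand-written sample dumps (assumed shape of the text, not real output).
SAMPLE_BEFORE = "%0 = nn.conv2d(%data, %w, padding=[1, 1]);\n%1 = nn.relu(%0);\n"
SAMPLE_AFTER = (
    '%0 = layout_transform(%data, src_layout="NCHW", dst_layout="NCHW8c");\n'
    "%1 = nn.contrib_conv2d_NCHWc(%0, %w);\n"
    "%2 = nn.relu(%1);\n"
)

if __name__ == "__main__":
    for op, (n_before, n_after) in diff_ops(SAMPLE_BEFORE, SAMPLE_AFTER).items():
        print(op, n_before, "->", n_after)
```

A conv2d count that drops while a blocked-layout variant appears would be consistent with the template mismatch suggested earlier in the thread.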

There is no problem if I do not use autotuning and autotvm.apply_history_best, and just build under relay.build_config(opt_level=3).

The params passed to extract_from_program are still the fp32 weights. We can use them to create the quantized tasks.

@HuiFeng

relay.build also fails for me when I use autotuning with autotvm.apply_history_best.

Have you solved it?

It seems to work well, but I am still confused about why extract_from_program can take fp32 weights (params=params). I thought it should take int8 params.