Thanks for your reply. Yesterday I tuned the model following the tune_relay_x86 tutorial, setting target = tvm.target.create("llvm -mcpu=broadwell"), but the inference time was longer than before tuning. It also reported the warnings below:
……
Cannot find config for target=llvm -device=tracing, workload=('conv2d', (1, 128, 56, 56, 'float32'), (128, 128, 3, 3, 'float32'), (2, 2), (1, 1), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=tracing, workload=('conv2d', (1, 128, 28, 28, 'float32'), (256, 128, 3, 3, 'float32'), (1, 1), (1, 1), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=tracing, workload=('conv2d', (1, 256, 28, 28, 'float32'), (256, 256, 3, 3, 'float32'), (2, 2), (1, 1), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=tracing, workload=('conv2d', (1, 256, 14, 14, 'float32'), (512, 256, 3, 3, 'float32'), (1, 1), (1, 1), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
Cannot find config for target=llvm -device=tracing, workload=('conv2d', (1, 512, 14, 14, 'float32'), (512, 512, 3, 3, 'float32'), (2, 2), (1, 1), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
2019-07-26 02:12:28,616 INFO Start to benchmark layout transformation...
2019-07-26 02:19:41,903 INFO Benchmarking layout transformation successful.
2019-07-26 02:19:41,926 INFO Start to run dynamic programming algorithm...
2019-07-26 02:19:41,926 INFO Start forward pass...
2019-07-26 02:19:42,538 INFO Finished forward pass.
2019-07-26 02:19:42,538 INFO Start backward pass...
2019-07-26 02:19:42,540 INFO Finished backward pass...
2019-07-26 02:19:42,540 INFO Finished DPExecutor run.
2019-07-26 02:19:42,542 INFO Writing optimal schedules to mxnet-r50_cpu_graph_opt.log successfully.
Compile...
Config for target=llvm -mcpu=broadwell, workload=('dense', (1, 25088, 'float32'), (512, 25088, 'float32'), 0, 'float32') is missing in ApplyGraphBest context. A fallback configuration is used, which may bring great performance regression.
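To make it easier to see which targets and workloads the warnings refer to, here is a rough sketch that parses them out of the log text. It uses only the Python standard library. `WARNING_RE` and `missing_configs` are hypothetical helper names, not part of TVM, and the pattern assumes only the two warning formats shown above.

```python
import ast
import re

# Matches the two warning shapes in the log above:
#   "Cannot find config for target=..., workload=(...). A fallback ..."
#   "Config for target=..., workload=(...) is missing in ApplyGraphBest ..."
WARNING_RE = re.compile(
    r"(?:Cannot find config for|Config for) target=(?P<target>.+?), "
    r"workload=(?P<workload>\(.*?\))(?=\. A fallback| is missing)"
)


def missing_configs(log_text):
    """Return (target, workload) pairs for every fallback warning in log_text."""
    pairs = []
    for match in WARNING_RE.finditer(log_text):
        # The workload is printed as a Python tuple literal, so it can be
        # parsed safely with ast.literal_eval.
        workload = ast.literal_eval(match.group("workload"))
        pairs.append((match.group("target"), workload))
    return pairs


# Two lines copied from the log above.
sample = (
    "Cannot find config for target=llvm -device=tracing, "
    "workload=('conv2d', (1, 128, 56, 56, 'float32'), "
    "(128, 128, 3, 3, 'float32'), (2, 2), (1, 1), (1, 1), 'NCHW', 'float32'). "
    "A fallback configuration is used, which may bring great performance regression.\n"
    "Config for target=llvm -mcpu=broadwell, "
    "workload=('dense', (1, 25088, 'float32'), (512, 25088, 'float32'), 0, 'float32') "
    "is missing in ApplyGraphBest context. "
    "A fallback configuration is used, which may bring great performance regression.\n"
)

pairs = missing_configs(sample)
for target, workload in pairs:
    print(target, "|", workload[0], "|", workload[1:])

# Distinct target strings mentioned by the warnings.
print(sorted({target for target, _ in pairs}))
```

On the log above, this lists the conv2d warnings under target "llvm -device=tracing" and the dense warning under "llvm -mcpu=broadwell".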
How can I solve this?