[auto-tune] Using auto-tuning on an x86 i3 with one thread

dolphintear · May 16, 2019, 1:27am

Before auto-tuning, InceptionV1 inference (model from TensorFlow) took about 1.1826 seconds for one input image. After auto-tuning with ops=(relay.op.nn.conv2d,), it dropped to 0.4735 seconds per inference. Does that sound good? I also have another question about auto-tuning: with ops=(relay.op.nn.conv2d,), some conv2d workloads still have not been tuned, and I don't know what causes this. The log is shown below:
WARNING:autotvm:Cannot find config for target=llvm, workload=('conv2d', (1, 192, 35, 35, 'float32'), (176, 192, 1, 1, 'float32'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=llvm, workload=('conv2d', (1, 256, 35, 35, 'float32'), (176, 256, 1, 1, 'float32'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=llvm, workload=('conv2d', (1, 288, 35, 35, 'float32'), (176, 288, 1, 1, 'float32'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=llvm, workload=('conv2d', (1, 768, 17, 17, 'float32'), (448, 768, 1, 1, 'float32'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=llvm, workload=('conv2d', (1, 768, 17, 17, 'float32'), (512, 768, 1, 1, 'float32'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=llvm, workload=('conv2d', (1, 768, 17, 17, 'float32'), (576, 768, 1, 1, 'float32'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=llvm, workload=('conv2d', (1, 1280, 8, 8, 'float32'), (1152, 1280, 1, 1, 'float32'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=llvm, workload=('conv2d', (1, 2048, 8, 8, 'float32'), (1152, 2048, 1, 1, 'float32'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=llvm, workload=('dense', (1, 2048, 'float32'), (1008, 2048, 'float32'), 0, 'float32'). A fallback configuration is used, which may bring great performance regression.
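
To see at a glance which operators and shapes the warnings refer to, the log can be parsed and grouped by operator name. This is only a sketch in plain Python and is not part of TVM: `missing_workloads` and `sample_log` are hypothetical names introduced here, and it assumes each warning sits on one line in the format shown above.

```python
import ast
import re
from collections import defaultdict

# Captures the workload tuple inside an autotvm "Cannot find config" warning.
_WARNING_RE = re.compile(r"workload=(\(.*\))\. A fallback")


def missing_workloads(log_text):
    """Group workloads that fell back to a default config, keyed by operator name."""
    by_op = defaultdict(list)
    for line in log_text.splitlines():
        match = _WARNING_RE.search(line)
        if match is None:
            continue
        # Forum software may turn straight quotes into curly ones; undo that.
        raw = match.group(1).replace("\u2018", "'").replace("\u2019", "'")
        workload = ast.literal_eval(raw)
        # First element is the operator name, the rest are its arguments.
        by_op[workload[0]].append(workload[1:])
    return dict(by_op)


# Two lines copied from the log above (assumed representative).
sample_log = """\
WARNING:autotvm:Cannot find config for target=llvm, workload=('conv2d', (1, 192, 35, 35, 'float32'), (176, 192, 1, 1, 'float32'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=llvm, workload=('dense', (1, 2048, 'float32'), (1008, 2048, 'float32'), 0, 'float32'). A fallback configuration is used, which may bring great performance regression.
"""

missing = missing_workloads(sample_log)
for op_name, workloads in missing.items():
    print(op_name, len(workloads))
```

Running it over the full log separates the conv2d entries from the single dense entry, which makes it easier to compare the untuned conv2d shapes against the tasks that were actually tuned.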

For dense, I know the reason: I did not include it in the ops argument, ops=(relay.op.nn.conv2d,), and even if dense were tuned, it would have little impact on performance. But why do the other warnings, such as autotvm:Cannot find config for target=llvm, workload=('conv2d', (1, 2048, 8, 8, 'float32'), (1152, 2048, 1, 1, 'float32'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float32'), still appear after auto-tuning with relay.op.nn.conv2d? Thanks in advance!
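
On the timing part of the question, the two numbers quoted at the top of this post work out to roughly a 2.5x speedup. A quick check, assuming both figures are wall-clock seconds per inference measured the same way (the variable names are just for illustration):

```python
before_s = 1.1826  # seconds per inference before tuning (from the post)
after_s = 0.4735   # seconds per inference after tuning conv2d (from the post)

speedup = before_s / after_s                     # how many times faster
time_saved_pct = (1 - after_s / before_s) * 100  # share of latency removed

print(f"speedup: {speedup:.2f}x, time saved: {time_saved_pct:.1f}%")
```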

merrymercy · May 18, 2019, 7:27am

See this issue: https://github.com/dmlc/tvm/issues/2827