The x86 autotvm tutorial now shows lower inference performance than using NNVM IR. The reason is that some incorrect workloads are generated when the alter_op_layout pass is called:
```
WARNING:autotvm:Cannot find config for target=llvm -mcpu=skylake-avx512, workload=('conv2d', (1, 64, 56, 56, 'float32'), (320, 64, 1, 1, 'float32'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=llvm -mcpu=skylake-avx512, workload=('conv2d', (1, 256, 56, 56, 'float32'), (640, 256, 1, 1, 'float32'), (2, 2), (0, 0), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=llvm -mcpu=skylake-avx512, workload=('conv2d', (1, 512, 28, 28, 'float32'), (1280, 512, 1, 1, 'float32'), (2, 2), (0, 0), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=llvm -mcpu=skylake-avx512, workload=('conv2d', (1, 1024, 14, 14, 'float32'), (2560, 1024, 1, 1, 'float32'), (2, 2), (0, 0), (1, 1), 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
```
These workloads should not appear in ResNet-50; no layer in the network has 1x1 kernels with these output-channel counts (320, 640, 1280, 2560).
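One possibly relevant observation (my own arithmetic, not confirmed from the code): each bogus output-channel count equals the sum of the two 1x1 convolutions that share the same input tensor at a ResNet-50 stage transition, i.e. the projection shortcut plus the first bottleneck conv. The quick check below uses the standard ResNet-50 channel counts:

```python
# (in_channels, bogus_out_channels) taken from the four warnings above
bogus = [(64, 320), (256, 640), (512, 1280), (1024, 2560)]

# ResNet-50 stage transitions that consume the same input tensor twice:
# (input channels, projection-shortcut 1x1 out, first bottleneck 1x1 out)
resnet50_branches = [
    (64, 256, 64),
    (256, 512, 128),
    (512, 1024, 256),
    (1024, 2048, 512),
]

for (cin, cout), (cin2, shortcut, bottleneck) in zip(bogus, resnet50_branches):
    assert cin == cin2
    # every bogus out-channel count is shortcut + bottleneck
    print(f"in={cin}: bogus out={cout}, shortcut+bottleneck={shortcut + bottleneck}")
    assert cout == shortcut + bottleneck
```

If that pattern is not a coincidence, it would suggest the pass is somehow merging the weight shapes of parallel convolutions that share an input.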
In https://github.com/dmlc/tvm/blob/master/topi/python/topi/x86/conv2d.py#L289, the kernel has an incorrect shape for these four workloads. It looks like tinfo is not generated correctly in the alter_op_layout pass?