I’m running into an issue that I believe may be affecting others as well.
I am using the following PyTorch network, which is later exported to ONNX:
import torch.nn as nn

net = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=128, kernel_size=3, dilation=(1, 1)),
    nn.ReLU(),
    nn.Conv2d(in_channels=128, out_channels=128, kernel_size=3, dilation=(1, 1)),
    nn.ReLU(),
    nn.Conv2d(in_channels=128, out_channels=64, kernel_size=3, dilation=(1, 1)),
)
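As a sanity check on the workload shapes that appear later in the warnings and the tuning log: each 3x3 conv with no padding shrinks the spatial size by 2, so a 612x612 input yields 610, 608, and 606 feature maps. A minimal sketch (the helper function and the 612x612 input size are my own, taken from the logged workloads, not from any TVM or PyTorch API):

```python
# Hypothetical helper: standard conv output-size formula.
def conv_out_size(size, kernel=3, stride=1, padding=0, dilation=1):
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

size = 612  # input resolution inferred from the logged workloads
for _ in range(3):  # the three Conv2d layers above
    size = conv_out_size(size)
    print(size)
# prints 610, 608, 606 -- matching the H/W values in the warnings and log
```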
When running it through the example at https://docs.tvm.ai/tutorials/autotvm/tune_nnvm_x86.html, tuning runs fine, but at the point of compiling the final network I get:
WARNING:autotvm:Cannot find config for target=llvm -mcpu=core-avx2,
workload=('conv2d_NCHWc', (1, 16, 608, 608, 8, 'float32'), (8, 16, 3, 3, 8, 8, 'float32'), (1, 1), (0, 0), (1, 1), 'NCHW8c', 'NCHW8c', 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=llvm -mcpu=core-avx2,
workload=('conv2d_NCHWc', (1, 16, 610, 610, 8, 'float32'), (16, 16, 3, 3, 8, 8, 'float32'), (1, 1), (0, 0), (1, 1), 'NCHW8c', 'NCHW8c', 'float32'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=llvm -mcpu=core-avx2,
workload=('conv2d_NCHWc', (1, 1, 612, 612, 3, 'float32'), (16, 1, 3, 3, 3, 8, 'float32'), (1, 1), (0, 0), (1, 1), 'NCHW3c', 'NCHW8c', 'float32'). A fallback configuration is used, which may bring great performance regression.
This seems weird at first, but looking at the .log file I see:
{"r": [[0.715419564], 0, 3.2787282466888428, 1551285782.4269547], "i": ["llvm -mcpu=core-avx2", "topi_x86_conv2d_NCHWc", [["TENSOR", [1, 128, 608, 608], "float32"],
["TENSOR", [64, 128, 3, 3], "float32"], [1, 1], [0, 0], [1, 1], "NCHW", "float32"], {}, ["conv2d", [1, 128, 608, 608, "float32"], [64, 128, 3, 3, "float32"], [1, 1], [0, 0], [1, 1], "NCHW", "float32"], {"t": "direct", "e": [["tile_ic", "sp", [32, 4]], ["tile_oc", "sp", [4, 16]], ["tile_ow", "sp", [101, 6]], ["unroll_kw", "ot", true]], "c": null, "i": 202}], "v": 0.1}
{"r": [[0.991738546], 0, 4.449911117553711, 1551285848.6736164], "i": ["llvm -mcpu=core-avx2",
"topi_x86_conv2d_NCHWc", [["TENSOR", [1, 128, 610, 610], "float32"], ["TENSOR", [128, 128, 3, 3], "float32"], [1, 1], [0, 0], [1, 1], "NCHW", "float32"], {}, ["conv2d", [1, 128, 610, 610, "float32"], [128, 128, 3, 3, "float32"], [1, 1], [0, 0], [1, 1], "NCHW", "float32"], {"t": "direct", "e": [["tile_ic", "sp", [8, 16]], ["tile_oc", "sp", [8, 16]], ["tile_ow", "sp", [152, 4]], ["unroll_kw", "ot", false]], "c": null, "i": 676}], "v": 0.1}
{"r": [[0.03143428715625], 0, 1.458106517791748, 1551285879.0393698], "i": ["llvm -mcpu=core-avx2",
"topi_x86_conv2d_NCHWc", [["TENSOR", [1, 3, 612, 612], "float32"], ["TENSOR", [128, 3, 3, 3], "float32"], [1, 1], [0, 0], [1, 1], "NCHW", "float32"], {}, ["conv2d", [1, 3, 612, 612, "float32"], [128, 3, 3, 3, "float32"], [1, 1], [0, 0], [1, 1], "NCHW", "float32"], {"t": "direct", "e": [["tile_ic", "sp", [1, 3]], ["tile_oc", "sp", [2, 64]], ["tile_ow", "sp", [610, 1]], ["unroll_kw", "ot", false]], "c": null, "i": 93}], "v": 0.1}
My understanding is that the task args passed to autotvm.task.create are in the original NCHW layout, while during the final compilation phase the TVM optimizer looks up a task whose args are in NCHWc layout, so the lookup fails (compare the workloads in the warnings with the tuned workloads in the log entries above).
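To make the mismatch concrete, here is a small illustration (my own helper, not TVM code) of how an NCHW tensor is packed into the NCHW{c}c layout: the log's tuned workload (1, 128, 608, 608) corresponds exactly to the warning's requested workload (1, 16, 608, 608, 8) with a channel block of 8.

```python
import numpy as np

# Hypothetical illustration of NCHW -> NCHW{c}c packing (not a TVM API).
def pack_nchw_to_nchwc(x, c_block):
    n, c, h, w = x.shape
    assert c % c_block == 0
    # Split channels into (C // c_block, c_block) and move the inner
    # block to the last axis: NCHW -> NCHW{c_block}c.
    return x.reshape(n, c // c_block, c_block, h, w).transpose(0, 1, 3, 4, 2)

x = np.empty((1, 128, 608, 608), dtype="float32")  # conv2d input from the log
print(pack_nchw_to_nchwc(x, 8).shape)
# (1, 16, 608, 608, 8) -- the shape the compiler asks for in the first warning
```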
If I lower the optimization level to opt_level=2, compilation passes without warnings.
Is there a way to extract the tasks already in NCHWc layout? Or am I missing something important that is causing this warning?