I tried to tune a model for an Intel x86_64 CPU with target=llvm.
AutoTVM skips the depthwise_conv2d_nchw workloads and generates invalid log lines.
- Tuning takes only 1-2 sec per task, and each task stops after a single trial (Progress: 1/1000).
----------New Workloads---------------
('depthwise_conv2d_nchw', (1, 960, 9, 9, 'float32'), (960, 1, 3, 3, 'float32'), (1, 1), (0, 0), (1, 1), 'float32')
('depthwise_conv2d_nchw', (1, 576, 15, 15, 'float32'), (576, 1, 3, 3, 'float32'), (2, 2), (0, 0), (1, 1), 'float32')
('depthwise_conv2d_nchw', (1, 576, 16, 16, 'float32'), (576, 1, 3, 3, 'float32'), (1, 1), (0, 0), (1, 1), 'float32')
('depthwise_conv2d_nchw', (1, 384, 16, 16, 'float32'), (384, 1, 3, 3, 'float32'), (1, 1), (0, 0), (1, 1), 'float32')
('depthwise_conv2d_nchw', (1, 192, 29, 29, 'float32'), (192, 1, 3, 3, 'float32'), (2, 2), (0, 0), (1, 1), 'float32')
('depthwise_conv2d_nchw', (1, 192, 30, 30, 'float32'), (192, 1, 3, 3, 'float32'), (1, 1), (0, 0), (1, 1), 'float32')
('depthwise_conv2d_nchw', (1, 144, 57, 57, 'float32'), (144, 1, 3, 3, 'float32'), (2, 2), (0, 0), (1, 1), 'float32')
('depthwise_conv2d_nchw', (1, 144, 58, 58, 'float32'), (144, 1, 3, 3, 'float32'), (1, 1), (0, 0), (1, 1), 'float32')
('depthwise_conv2d_nchw', (1, 96, 113, 113, 'float32'), (96, 1, 3, 3, 'float32'), (2, 2), (0, 0), (1, 1), 'float32')
('depthwise_conv2d_nchw', (1, 32, 114, 114, 'float32'), (32, 1, 3, 3, 'float32'), (1, 1), (0, 0), (1, 1), 'float32')
--------------------------------------
Total: 10
Tuning...
[Task 1/10] Current/Best: 14.69/ 14.69 GFLOPS | Progress: (1/1000) | 1.18 s Done.
[Task 2/10] Current/Best: 4.07/ 4.07 GFLOPS | Progress: (1/1000) | 1.11 s Done.
[Task 3/10] Current/Best: 13.57/ 13.57 GFLOPS | Progress: (1/1000) | 1.21 s Done.
[Task 4/10] Current/Best: 5.91/ 5.91 GFLOPS | Progress: (1/1000) | 1.11 s Done.
[Task 5/10] Current/Best: 14.96/ 14.96 GFLOPS | Progress: (1/1000) | 1.14 s Done.
[Task 6/10] Current/Best: 2.28/ 2.28 GFLOPS | Progress: (1/1000) | 1.11 s Done.
[Task 7/10] Current/Best: 5.76/ 5.76 GFLOPS | Progress: (1/1000) | 1.15 s Done.
[Task 8/10] Current/Best: 3.71/ 3.71 GFLOPS | Progress: (1/1000) | 1.17 s Done.
[Task 9/10] Current/Best: 4.51/ 4.51 GFLOPS | Progress: (1/1000) | 1.27 s Done.
[Task 10/10] Current/Best: 4.68/ 4.68 GFLOPS | Progress: (1/1000) | 1.08 s Done.
As a result, it generates invalid log lines:
{"i": ["llvm", "topi_nn_depthwise_conv2d_nchw", [["TENSOR", [1, 32, 114, 114], "float32"], ["TENSOR", [32, 1, 3, 3], "float32"], [1, 1], [0, 0], [1, 1], "float32"], {}, ["depthwise_conv2d_nchw", [1, 32, 114, 114, "float32"], [32, 1, 3, 3, "float32"], [1, 1], [0, 0], [1, 1], "float32"], {"i": 0, "t": "direct", "c": null, "e": []}], "r": [[0.0004918456], 0, 0.565997838973999, 1563309733.8285108], "v": 0.1}
{"i": ["llvm", "topi_nn_depthwise_conv2d_nchw", [["TENSOR", [1, 96, 113, 113], "float32"], ["TENSOR", [96, 1, 3, 3], "float32"], [2, 2], [0, 0], [1, 1], "float32"], {}, ["depthwise_conv2d_nchw", [1, 96, 113, 113, "float32"], [96, 1, 3, 3, "float32"], [2, 2], [0, 0], [1, 1], "float32"], {"i": 0, "t": "direct", "c": null, "e": []}], "r": [[0.0013306298], 0, 0.5449604988098145, 1563309735.648528], "v": 0.1}
{"i": ["llvm", "topi_nn_depthwise_conv2d_nchw", [["TENSOR", [1, 144, 58, 58], "float32"], ["TENSOR", [144, 1, 3, 3], "float32"], [1, 1], [0, 0], [1, 1], "float32"], {}, ["depthwise_conv2d_nchw", [1, 144, 58, 58, "float32"], [144, 1, 3, 3, "float32"], [1, 1], [0, 0], [1, 1], "float32"], {"i": 0, "t": "direct", "c": null, "e": []}], "r": [[0.0005992043], 0, 0.5612428188323975, 1563309737.5331435], "v": 0.1}
{"i": ["llvm", "topi_nn_depthwise_conv2d_nchw", [["TENSOR", [1, 144, 57, 57], "float32"], ["TENSOR", [144, 1, 3, 3], "float32"], [2, 2], [0, 0], [1, 1], "float32"], {}, ["depthwise_conv2d_nchw", [1, 144, 57, 57, "float32"], [144, 1, 3, 3, "float32"], [2, 2], [0, 0], [1, 1], "float32"], {"i": 0, "t": "direct", "c": null, "e": []}], "r": [[0.00034402959999999997], 0, 0.5536956787109375, 1563309739.319174], "v": 0.1}
{"i": ["llvm", "topi_nn_depthwise_conv2d_nchw", [["TENSOR", [1, 192, 30, 30], "float32"], ["TENSOR", [192, 1, 3, 3], "float32"], [1, 1], [0, 0], [1, 1], "float32"], {}, ["depthwise_conv2d_nchw", [1, 192, 30, 30, "float32"], [192, 1, 3, 3, "float32"], [1, 1], [0, 0], [1, 1], "float32"], {"i": 0, "t": "direct", "c": null, "e": []}], "r": [[0.00018112869999999999], 0, 0.5282695293426514, 1563309741.111374], "v": 0.1}
{"i": ["llvm", "topi_nn_depthwise_conv2d_nchw", [["TENSOR", [1, 192, 29, 29], "float32"], ["TENSOR", [192, 1, 3, 3], "float32"], [2, 2], [0, 0], [1, 1], "float32"], {}, ["depthwise_conv2d_nchw", [1, 192, 29, 29, "float32"], [192, 1, 3, 3, "float32"], [2, 2], [0, 0], [1, 1], "float32"], {"i": 0, "t": "direct", "c": null, "e": []}], "r": [[0.0002968735], 0, 0.5278668403625488, 1563309742.868396], "v": 0.1}
{"i": ["llvm", "topi_nn_depthwise_conv2d_nchw", [["TENSOR", [1, 384, 16, 16], "float32"], ["TENSOR", [384, 1, 3, 3], "float32"], [1, 1], [0, 0], [1, 1], "float32"], {}, ["depthwise_conv2d_nchw", [1, 384, 16, 16, "float32"], [384, 1, 3, 3, "float32"], [1, 1], [0, 0], [1, 1], "float32"], {"i": 0, "t": "direct", "c": null, "e": []}], "r": [[0.0002350044], 0, 0.5902194976806641, 1563309744.7191393], "v": 0.1}
{"i": ["llvm", "topi_nn_depthwise_conv2d_nchw", [["TENSOR", [1, 576, 16, 16], "float32"], ["TENSOR", [576, 1, 3, 3], "float32"], [1, 1], [0, 0], [1, 1], "float32"], {}, ["depthwise_conv2d_nchw", [1, 576, 16, 16, "float32"], [576, 1, 3, 3, "float32"], [1, 1], [0, 0], [1, 1], "float32"], {"i": 0, "t": "direct", "c": null, "e": []}], "r": [[0.000547472], 0, 0.5910224914550781, 1563309746.4457006], "v": 0.1}
{"i": ["llvm", "topi_nn_depthwise_conv2d_nchw", [["TENSOR", [1, 576, 15, 15], "float32"], ["TENSOR", [576, 1, 3, 3], "float32"], [2, 2], [0, 0], [1, 1], "float32"], {}, ["depthwise_conv2d_nchw", [1, 576, 15, 15, "float32"], [576, 1, 3, 3, "float32"], [2, 2], [0, 0], [1, 1], "float32"], {"i": 0, "t": "direct", "c": null, "e": []}], "r": [[0.0001127131], 0, 0.6004319190979004, 1563309748.3819141], "v": 0.1}
{"i": ["llvm", "topi_nn_depthwise_conv2d_nchw", [["TENSOR", [1, 960, 9, 9], "float32"], ["TENSOR", [960, 1, 3, 3], "float32"], [1, 1], [0, 0], [1, 1], "float32"], {}, ["depthwise_conv2d_nchw", [1, 960, 9, 9, "float32"], [960, 1, 3, 3, "float32"], [1, 1], [0, 0], [1, 1], "float32"], {"i": 0, "t": "direct", "c": null, "e": []}], "r": [[0.0001809171], 0, 0.4957103729248047, 1563309750.1361008], "v": 0.1}
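What the log lines above have in common is that the config part carries no tuned knobs ("t": "direct" and an empty "e" list). A minimal sketch for flagging such records with the standard library only is given here; the field layout of "i" is read off the lines above, and the helper name find_empty_configs is hypothetical, not part of TVM.

```python
import json

# One record copied from the log above (the 960-channel workload).
SAMPLE_LINE = (
    '{"i": ["llvm", "topi_nn_depthwise_conv2d_nchw", '
    '[["TENSOR", [1, 960, 9, 9], "float32"], ["TENSOR", [960, 1, 3, 3], "float32"], '
    '[1, 1], [0, 0], [1, 1], "float32"], {}, '
    '["depthwise_conv2d_nchw", [1, 960, 9, 9, "float32"], [960, 1, 3, 3, "float32"], '
    '[1, 1], [0, 0], [1, 1], "float32"], '
    '{"i": 0, "t": "direct", "c": null, "e": []}], '
    '"r": [[0.0001809171], 0, 0.4957103729248047, 1563309750.1361008], "v": 0.1}'
)


def find_empty_configs(log_lines):
    """Return (task name, input shape, strides) for records with no knob entities."""
    flagged = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Layout observed in the log above:
        # "i" = [target, task name, args, kwargs, workload, config]
        _target, task_name, _args, _kwargs, workload, config = record["i"]
        if not config.get("e"):
            # workload = [op name, input, kernel, strides, padding, dilation, dtype]
            flagged.append((task_name, workload[1], workload[3]))
    return flagged


if __name__ == "__main__":
    for task_name, input_shape, strides in find_empty_configs([SAMPLE_LINE]):
        print(task_name, input_shape, strides)
```

Running this over the whole log file (one record per line) should flag all ten records shown above, since each of them has an empty "e" list.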
If I run the evaluation script with these log lines, I get the following error: KeyError: 'tile_ic'
File "/usr/local/lib/python3.6/dist-packages/topi-0.6.dev0-py3.6.egg/topi/x86/conv2d.py", line 441, in _alter_conv2d_layout
ic_bn, oc_bn = cfg["tile_ic"].size[-1], cfg["tile_oc"].size[-1]
File "/usr/local/lib/python3.6/dist-packages/tvm-0.6.dev0-py3.6-linux-x86_64.egg/tvm/autotvm/task/space.py", line 773, in __getitem__
return self._entity_map[name]
KeyError: 'tile_ic'
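The traceback shows _alter_conv2d_layout looking up the knobs tile_ic and tile_oc, while the logged configs contain no knobs at all. The sketch below checks a logged config for those two names before it is used. It works on plain dicts rather than TVM objects, the helper names are hypothetical, and it assumes that each entry of "e" starts with the knob name (the log above has no non-empty example to confirm this).

```python
# Knob names taken from the traceback above.
REQUIRED_KNOBS = ("tile_ic", "tile_oc")


def entity_map_from_config(config):
    """Build a name -> entity mapping from the "e" list of a logged config.

    Assumption: every entry looks like [knob_name, ...].
    """
    return {entry[0]: entry[1:] for entry in config.get("e", [])}


def missing_knobs(config, required=REQUIRED_KNOBS):
    """Return the required knob names that the logged config does not define."""
    entity_map = entity_map_from_config(config)
    return [name for name in required if name not in entity_map]


if __name__ == "__main__":
    # Config part of the log lines above ("c": null becomes None in Python).
    logged_config = {"i": 0, "t": "direct", "c": None, "e": []}
    print(missing_knobs(logged_config))  # ['tile_ic', 'tile_oc']
```

Any record for which missing_knobs returns a non-empty list would hit the same KeyError in the lookup shown in the traceback.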
I also tried to run AutoTVM for ARM/Mali with target=tvm.target.mali('rk3399'), and it works fine.
The issue exists on pure llvm (x86_64).
I tried older TVM versions. The results are below:
Wed Jul 3 10:08:04 2019 - 287078c33db85d4f312d8d2457a064442d9d18c3 - Bad
Sun Jun 30 18:05:48 - 6c81d784dc9459d684604fcf4190fda4cb956c1c - Bad
Wed Jun 19 20:29:12 2019 - 05a5c170930ef649d6f196950e680ca16d30d07a - Bad
Mon Jun 10 21:26:15 - 7c1c97d2d8d0a99c752d43f95d92618b62b1f015 - Bad
Sat Jun 8 20:56:58 2019 - a4bc50ebfffe034490330a781ac077d958d43286 - Bad
Thu Jun 6 11:41:50 2019 - d7bc4fdd4789a730b7aadcaf441c3d50b9863f60 - Bad
Thu Jun 6 21:00:19 2019 - 770ac84e74a5d0cb174c1a5402f0752a5a8fbecb - OK
Wed Jun 5 22:03:12 2019 - 5999f7a6d8e174026b35dc938bb11442ffae6995 - OK
Fri May 31 19:42:15 2019 - f6acf2e5f51f9ac48f8d13e095805b7fe3f74bcf - OK
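The table above can be turned into a small check that finds where the status flips. This is only a sketch: it assumes the rows are listed in git history order, newest first, exactly as written above, and the function name regression_boundary is hypothetical.

```python
# (commit hash, status) pairs copied from the table above, newest first.
RESULTS = [
    ("287078c33db85d4f312d8d2457a064442d9d18c3", "Bad"),
    ("6c81d784dc9459d684604fcf4190fda4cb956c1c", "Bad"),
    ("05a5c170930ef649d6f196950e680ca16d30d07a", "Bad"),
    ("7c1c97d2d8d0a99c752d43f95d92618b62b1f015", "Bad"),
    ("a4bc50ebfffe034490330a781ac077d958d43286", "Bad"),
    ("d7bc4fdd4789a730b7aadcaf441c3d50b9863f60", "Bad"),
    ("770ac84e74a5d0cb174c1a5402f0752a5a8fbecb", "OK"),
    ("5999f7a6d8e174026b35dc938bb11442ffae6995", "OK"),
    ("f6acf2e5f51f9ac48f8d13e095805b7fe3f74bcf", "OK"),
]


def regression_boundary(results):
    """Return (first bad commit, last good commit), or None if there is no flip.

    Expects results ordered newest first.
    """
    for (newer, newer_status), (older, older_status) in zip(results, results[1:]):
        if newer_status == "Bad" and older_status == "OK":
            return newer, older
    return None


if __name__ == "__main__":
    first_bad, last_good = regression_boundary(RESULTS)
    print("first bad:", first_bad)
    print("last good:", last_good)
```

For a finer search between the two boundary commits, git bisect can be given the last good and first bad hashes as its starting points.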
So, the issue was introduced in PR https://github.com/dmlc/tvm/pull/3264
I opened a new issue: https://github.com/dmlc/tvm/issues/3557