It seems only certain output channel numbers work when using nnvm's from_tensorflow



When I set the output channel number to 75, it works, and the log is below. The relevant lines are the conv2d workloads whose kernel shapes are (75, 512, 1, 1) and (75, 256, 1, 1).

WARNING:tensorflow:From tvm_test.py:36: FastGFile.__init__ (from tensorflow.python.platform.gfile) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.gfile.GFile.
WARNING:tensorflow:From tvm_test.py:47: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.convert_variables_to_constants
WARNING:tensorflow:From /root/.pyenv/versions/3.5.5/lib/python3.5/site-packages/tensorflow/python/framework/graph_util_impl.py:245: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.extract_sub_graph
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 3, 418, 418, 'float16'), (16, 3, 3, 3, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 16, 210, 210, 'float16'), (32, 16, 3, 3, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 32, 106, 106, 'float16'), (64, 32, 3, 3, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 64, 54, 54, 'float16'), (128, 64, 3, 3, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 128, 28, 28, 'float16'), (256, 128, 3, 3, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 256, 15, 15, 'float16'), (512, 256, 3, 3, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 512, 15, 15, 'float16'), (1024, 512, 3, 3, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 1024, 13, 13, 'float16'), (256, 1024, 1, 1, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 512, 13, 13, 'float16'), (75, 512, 1, 1, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 256, 13, 13, 'float16'), (128, 256, 1, 1, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 384, 28, 28, 'float16'), (256, 384, 3, 3, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 256, 26, 26, 'float16'), (75, 256, 1, 1, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
[15:52:37] /tts/tvm_gpu/src/pass/vectorize_loop.cc:343: Detect vector condition in Vectorized Loop, scalarizing...
[15:52:37] /tts/tvm_gpu/src/pass/vectorize_loop.cc:343: Detect vector condition in Vectorized Loop, scalarizing...
[15:52:38] /tts/tvm_gpu/src/pass/vectorize_loop.cc:343: Detect vector condition in Vectorized Loop, scalarizing...
[15:52:38] /tts/tvm_gpu/src/pass/vectorize_loop.cc:343: Detect vector condition in Vectorized Loop, scalarizing...

But when I set the output channel number to 18, it fails, and the log is below.

WARNING:tensorflow:From tvm_test.py:36: FastGFile.__init__ (from tensorflow.python.platform.gfile) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.gfile.GFile.
WARNING:tensorflow:From tvm_test.py:47: convert_variables_to_constants (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.convert_variables_to_constants
WARNING:tensorflow:From /root/.pyenv/versions/3.5.5/lib/python3.5/site-packages/tensorflow/python/framework/graph_util_impl.py:245: extract_sub_graph (from tensorflow.python.framework.graph_util_impl) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.compat.v1.graph_util.extract_sub_graph
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 3, 418, 418, 'float16'), (16, 3, 3, 3, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 16, 210, 210, 'float16'), (32, 16, 3, 3, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 32, 106, 106, 'float16'), (64, 32, 3, 3, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 64, 54, 54, 'float16'), (128, 64, 3, 3, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 128, 28, 28, 'float16'), (256, 128, 3, 3, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 256, 15, 15, 'float16'), (512, 256, 3, 3, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 512, 15, 15, 'float16'), (1024, 512, 3, 3, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 1024, 13, 13, 'float16'), (256, 1024, 1, 1, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 512, 13, 13, 'float16'), (18, 512, 1, 1, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 256, 13, 13, 'float16'), (128, 256, 1, 1, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 384, 28, 28, 'float16'), (256, 384, 3, 3, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
WARNING:autotvm:Cannot find config for target=opencl -device=mali -model=rk3399, workload=('conv2d', (1, 256, 26, 26, 'float16'), (18, 256, 1, 1, 'float16'), (1, 1), (0, 0), (1, 1), 'NCHW', 'float16'). A fallback configuration is used, which may bring great performance regression.
[15:56:24] /tts/tvm_gpu/src/pass/vectorize_loop.cc:343: Detect vector condition in Vectorized Loop, scalarizing...
[15:56:24] /tts/tvm_gpu/src/pass/vectorize_loop.cc:343: Detect vector condition in Vectorized Loop, scalarizing...
[15:56:24] /tts/tvm_gpu/src/pass/vectorize_loop.cc:343: Detect vector condition in Vectorized Loop, scalarizing...
[15:56:24] /tts/tvm_gpu/src/pass/vectorize_loop.cc:343: Detect vector condition in Vectorized Loop, scalarizing...
Traceback (most recent call last):
  File "tvm_test.py", line 65, in <module>
    m.run(**{input_node: image_data})
  File "/tts/tvm/python/tvm/contrib/graph_runtime.py", line 151, in run
    self._run()
  File "/tts/tvm/python/tvm/_ffi/_ctypes/function.py", line 185, in __call__
    ctypes.byref(ret_val), ctypes.byref(ret_tcode)))
  File "/tts/tvm/python/tvm/_ffi/base.py", line 72, in check_call
    raise TVMError(py_str(_LIB.TVMGetLastError()))
tvm._ffi.base.TVMError: [15:56:32] /tts/tvm_gpu/src/runtime/module_util.cc:53: Check failed: ret == 0 (-1 vs. 0) [15:56:32] /tts/tvm_gpu/src/runtime/opencl/opencl_module.cc:216: OpenCL build error for device=0x7fad79fdd8:779:3: error: implicit declarations are not allowed
vstore6(((half6)((half)0.000000e+00f, (half)0.000000e+00f, (half)0.000000e+00f, (half)0.000000e+00f, (half)0.000000e+00f, (half)0.000000e+00f)), 0, conv + (((((((int)get_group_id(2)) * 13) + ((int)get_group_id(1))) * 13) + ((int)get_local_id(0))) * 6));
^

:781:5: error: implicit declarations are not allowed
vstore6((vload6(0, conv + (((((((int)get_group_id(2)) * 13) + ((int)get_group_id(1))) * 13) + ((int)get_local_id(0))) * 6)) + (((half6)(data_vec[((((((int)get_group_id(1)) * 13) + ((int)get_local_id(0))) * 512) + ci)], data_vec[((((((int)get_group_id(1)) * 13) + ((int)get_local_id(0))) * 512) + ci)], data_vec[((((((int)get_group_id(1)) * 13) + ((int)get_local_id(0))) * 512) + ci)], data_vec[((((((int)get_group_id(1)) * 13) + ((int)get_local_id(0))) * 512) + ci)], data_vec[((((((int)get_group_id(1)) * 13) + ((int)get_local_id(0))) * 512) + ci)], data_vec[((((((int)get_group_id(1)) * 13) + ((int)get_local_id(0))) * 512) + ci)])) * vload6(0, kernel_vec + (((((int)get_group_id(2)) * 512) + ci) * 6)))), 0, conv + (((((((int)get_group_id(2)) * 13) + ((int)get_group_id(1))) * 13) + ((int)get_local_id(0))) * 6));
^

error: Compiler frontend failed (error code 59)

Stack trace returned 10 entries:
[bt] (0) /tts/tvm/build/libtvm.so(dmlc::StackTrace[abi:cxx11](unsigned long)+0x13c) [0x7fae112cec]
[bt] (1) /tts/tvm/build/libtvm.so(+0xeb1dd8) [0x7fae738dd8]
[bt] (2) /tts/tvm/build/libtvm.so(+0xeec8c4) [0x7fae7738c4]
[bt] (3) /tts/tvm/build/libtvm.so(tvm::runtime::GraphRuntime::Run()+0x3c) [0x7fae771e04]
[bt] (4) /tts/tvm/build/libtvm.so(TVMFuncCall+0x70) [0x7fae7233c8]
[bt] (5) /root/.pyenv/versions/3.5.5/lib/python3.5/lib-dynload/_ctypes.cpython-35m-aarch64-linux-gnu.so(ffi_call_SYSV+0x64) [0x7fb2779f1c]
[bt] (6) /root/.pyenv/versions/3.5.5/lib/python3.5/lib-dynload/_ctypes.cpython-35m-aarch64-linux-gnu.so(ffi_call+0x128) [0x7fb277b198]
[bt] (7) /root/.pyenv/versions/3.5.5/lib/python3.5/lib-dynload/_ctypes.cpython-35m-aarch64-linux-gnu.so(_ctypes_callproc+0x494) [0x7fb2772e44]
[bt] (8) /root/.pyenv/versions/3.5.5/lib/python3.5/lib-dynload/_ctypes.cpython-35m-aarch64-linux-gnu.so(+0x9b54) [0x7fb2768b54]
[bt] (9) /root/.pyenv/versions/3.5.5/lib/libpython3.5m.so.1.0(PyObject_Call+0x54) [0x7fb31a0a84]
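
A possible reading of the build error, offered as an assumption rather than a confirmed diagnosis: OpenCL C only defines vector types and vloadn/vstoren for widths 2, 3, 4, 8 and 16, so the `half6`, `vload6` and `vstore6` in the generated kernel are undeclared names, which would explain "implicit declarations are not allowed". The sketch below is a small standalone check in plain Python (it does not use TVM); the helper name `find_illegal_vector_widths` and the shortened `snippet` string are made up for illustration. It scans kernel source text for vloadN/vstoreN calls whose width is not one OpenCL C defines.

```python
import re

# Vector widths that OpenCL C defines for built-in vector types
# (e.g. half4, float8) and for the vloadn / vstoren functions.
OPENCL_VECTOR_WIDTHS = {2, 3, 4, 8, 16}


def find_illegal_vector_widths(kernel_source):
    """Return the sorted vloadN/vstoreN widths that OpenCL C does not define.

    Hypothetical helper for inspecting generated kernel text; not a TVM API.
    """
    widths = {
        int(n)
        for n in re.findall(r"\bv(?:load|store)(\d+)\s*\(", kernel_source)
    }
    return sorted(widths - OPENCL_VECTOR_WIDTHS)


# Shortened stand-in for the failing kernel line in the log above.
snippet = "vstore6(((half6)((half)0.0f)), 0, conv + 6); vload4(0, p);"
print(find_illegal_vector_widths(snippet))  # [6]
```

If this reading is right, the failure would come from the fallback schedule picking a vectorization width of 6 for the 18-channel conv, not from from_tensorflow itself. Why 75 channels avoids the problem is not visible in the logs.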