I also tried it with the TVM CUDA version; the TVM build is based on the tvmai/ci-gpu:v0.64 image (TVM dev0.71).
It produces the following error:
[Task 53/55] Current/Best: 7418.46/8419.66 GFLOPS | Progress: (960/2000) | 2493.48 s Done.
[Task 54/55] Current/Best: 6438.97/17485.65 GFLOPS | Progress: (896/2000) | 2699.27 s Done.
[Task 55/55] Current/Best: 1054.10/5853.12 GFLOPS | Progress: (1984/2000) | 5311.74 s Done.
Compile...
[02:50:51] /root/tvm/src/te/schedule/bound.cc:119: not in feed graph consumer = extern(argsort_gpu, 0x698c6a0)
Evaluate inference time cost...
Traceback (most recent call last):
File "tune_relay_cuda_ssd.py", line 250, in <module>
    tune_and_evaluate(tuning_option)
File "tune_relay_cuda_ssd.py", line 243, in tune_and_evaluate
    prof_res = np.array(ftimer().results) * 1000  # convert to millisecond
File "/root/tvm/python/tvm/runtime/module.py", line 215, in evaluator
    blob = feval(*args)
File "/root/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 225, in __call__
    raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
[bt] (6) /root/tvm/build/libtvm.so(TVMFuncCall+0x61) [0x7f81d7d14841]
[bt] (5) /root/tvm/build/libtvm.so(+0xea3d67) [0x7f81d7d59d67]
[bt] (4) /root/tvm/build/libtvm.so(+0xea38ca) [0x7f81d7d598ca]
[bt] (3) /root/tvm/build/libtvm.so(tvm::runtime::GraphRuntime::Run()+0x47) [0x7f81d7d6e0e7]
[bt] (2) /root/tvm/build/libtvm.so(+0xeb8057) [0x7f81d7d6e057]
[bt] (1) /root/tvm/build/libtvm.so(+0xe79d66) [0x7f81d7d2fd66]
[bt] (0) /root/tvm/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x67) [0x7f81d731c367]
File "/root/tvm/src/runtime/library_module.cc", line 78
TVMError: Check failed: ret == 0 (-1 vs. 0) : Assert fail: (num_args == 4), fused_nn_softmax_2: num_args should be 4
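For context, the call that triggers the crash is the standard evaluation step from the tutorial, which converts the profiling results of TVM's time evaluator from seconds to milliseconds. A minimal sketch of just that reporting step is below; the dummy `results` list stands in for the real `ftimer().results` (an assumption here, since the real values require a successfully built module, which is exactly what fails above):

```python
import numpy as np

# Dummy per-repeat wall-clock times in seconds. In the real script these
# come from ftimer().results, where ftimer is produced by the module's
# time evaluator; here they are hard-coded so the snippet runs standalone.
results = [0.0123, 0.0119, 0.0125]

# Same conversion as in tune_relay_cuda_ssd.py: seconds -> milliseconds.
prof_res = np.array(results) * 1000

print("Mean inference time (std dev): %.2f ms (%.2f ms)"
      % (np.mean(prof_res), np.std(prof_res)))
```

In the failing run, the error is raised inside `feval(*args)` before any timing results are produced, so the conversion above is never reached.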