Compiling SSD for CUDA target seg faults

I was trying to compile the SSD model in this tutorial for the CUDA target. It initially failed here.


I understand that the current nms op only has a CPU version. I just want to get compilation to run through first and then work on the GPU version of nms.

After changing tvm.make.Min/Max to tvm.min/max, it crashed with a seg fault in Halide.

[21:04:38] /home/ubuntu/unison/tvm/src/pass/arg_binder.cc:87: Trying to bind buffer to another one with lower alignment requirement  required_alignment=8, provided_alignment=4
[21:04:38] /home/ubuntu/unison/tvm/src/arithmetic/int_set.cc:514: cannot evaluate set type Load
[21:04:38] /home/ubuntu/unison/tvm/src/arithmetic/int_set.cc:514: cannot evaluate set type 

Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007fffe4dd9542 in tvm::NodeBase::IncRef (this=<error reading variable: Cannot access memory at address 0x7fffff7fdfe8>)
    at /home/ubuntu/unison/tvm/3rdparty/HalideIR/src/tvm/node/node_base.h:68
68	  void IncRef() {

Does anyone know how to solve this problem? Thanks.

I can confirm this issue.
Initially, the problem in tvm.select(clip, tvm.make.Max(0, tvm.make.Min(1, ox - ow)), ox - ow) is a type mismatch. Changing 0 and 1 to floats fixes that. However, I found that both this fix and using tvm.min/max still lead to the seg fault.
The seg fault comes from an infinite recursion in ConvertSSA. Since the CPU and GPU have different multibox implementations, this is likely an issue in the multibox IR on GPU.
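
To illustrate the type mismatch part only (not the seg fault), here is a toy sketch in plain Python. It is not TVM code: the Expr class and the strict_min/strict_max helpers are hypothetical stand-ins for a node constructor that refuses operands of different dtypes, which is how I understand tvm.make.Min/Max to behave. Integer literals next to a float32 expression are rejected, while float literals pass.

```python
# Toy model of a strictly typed IR node constructor.
# NOT TVM's implementation; every name here is hypothetical.

class Expr:
    """An expression that only remembers its dtype."""
    def __init__(self, dtype):
        self.dtype = dtype


def as_expr(value):
    """Wrap Python literals: int -> int32, float -> float32 (an assumption)."""
    if isinstance(value, Expr):
        return value
    if isinstance(value, bool):
        raise TypeError("bool literals are not handled in this sketch")
    if isinstance(value, int):
        return Expr("int32")
    if isinstance(value, float):
        return Expr("float32")
    raise TypeError("unsupported literal: %r" % (value,))


def strict_binary(name, a, b):
    """Build a binary node, refusing operands whose dtypes differ."""
    a, b = as_expr(a), as_expr(b)
    if a.dtype != b.dtype:
        raise TypeError("%s: dtype mismatch %s vs %s" % (name, a.dtype, b.dtype))
    return Expr(a.dtype)


def strict_min(a, b):
    return strict_binary("Min", a, b)


def strict_max(a, b):
    return strict_binary("Max", a, b)


# Stand-in for the float32 expression `ox - ow`.
ox_minus_ow = Expr("float32")

# Integer literals against a float32 operand are rejected.
try:
    strict_max(0, strict_min(1, ox_minus_ow))
    mismatch = False
except TypeError:
    mismatch = True

# Float literals have the same dtype as the operand, so this succeeds.
clipped = strict_max(0.0, strict_min(1.0, ox_minus_ow))
```

In this sketch the float-literal version type-checks, which matches what I saw for the first error. It says nothing about the ConvertSSA recursion, which happens later in the pipeline.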

Thanks for looking into this. Can this bug be fixed soon? We are depending on this to support SSD for CUDA target.

Actually, this issue is related to NMS on CUDA. The error can be reproduced by enabling the nms test on CUDA: https://github.com/dmlc/tvm/blob/25a7b46c83f076b9c75d5f325d9e5fe18b10deb7/topi/tests/python/test_topi_vision.py#L46

For a quick solution, please check out the previous commit of the nms.py file. The most recent commit of nms.py on GitHub works for Intel graphics but hasn't been tested on CUDA devices.