Relocations in generic ELF (EM: 62) when tuning for NVIDIA GPU on TX2

I am trying to tune a single convolutional layer for NVIDIA GPU on Jetson TX2. I am referencing the tutorial in tvm/tutorials/autotvm/tune_nnvm_cuda.py.

The layer is defined in PyTorch:

model = nn.Sequential(nn.Conv2d(in_chns, out_chns, ksize, stride, pad, groups=1, bias=use_bias),)

and exported via ONNX, then imported into NNVM.

I am cross-compiling on a host machine (x86_64-linux-gnu) and targeting CUDA on the TX2, with target=tvm.target.cuda() and target_host="llvm -target=aarch64-linux-gnu".

When I try to run the autotuner, I always see 0.00/ 0.00 GFLOPS:

[Task 1/ 1] Current/Best: 0.00/ 0.00 GFLOPS | Progress: (96/1000) | 87.82 s

I tried enabling the logger as suggested in the tutorial and got this output:

INFO:autotvm:Get devices for measurement successfully!
DEBUG:autotvm:No: 1	GFLOPS: 0.00/0.00	result: MeasureResult(costs=(RuntimeError('Except caught from RPC call: TVMCall CFunc Error:\nTraceback (most recent call last):\n  File "/home/nvidia/tvm/python/tvm/_ffi/_ctypes/function.py", line 54, in cfun\n    try:\n  File "/home/nvidia/tvm/python/tvm/rpc/server.py", line 50, in load_module\n    m = _load_module(path)\n  File "/home/nvidia/tvm/python/tvm/module.py", line 222, in load\n    _cc.create_shared(path + ".so", files)\n  File "/home/nvidia/tvm/python/tvm/contrib/cc.py", line 33, in create_shared\n    _linux_shared(output, objects, options, cc)\n  File "/home/nvidia/tvm/python/tvm/contrib/cc.py", line 58, in _linux_shared\n    raise RuntimeError(msg)\nRuntimeError: Compilation error:\n/usr/bin/ld: /tmp/tmphj1i03/lib.o: Relocations in generic ELF (EM: 62)\n/usr/bin/ld: /tmp/tmphj1i03/lib.o: Relocations in generic ELF (EM: 62)\n/tmp/tmphj1i03/lib.o: error adding symbols: File in wrong format\ncollect2: error: ld returned 1 exit status\n\n',),), error_no=4, all_cost=2.0872275829315186, timestamp=1536378632.1542888)	[('tile_f', [1, 1, 1, 1]), ('tile_y', [2, 56, 2, 1]), ('tile_x', [14, 1, 8, 2]), ('tile_rc', [1, 3]), ('tile_ry', [5, 1]), ('tile_rx', [1, 5]), ('auto_unroll_max_step', 0), ('unroll_explicit', 0)],direct,None,271508
DEBUG:autotvm:No: 2	GFLOPS: 0.00/0.00	result: MeasureResult(costs=(RuntimeError('Except caught from RPC call: TVMCall CFunc Error:\nTraceback (most recent call last):\n  File "/home/nvidia/tvm/python/tvm/_ffi/_ctypes/function.py", line 54, in cfun\n    try:\n  File "/home/nvidia/tvm/python/tvm/rpc/server.py", line 50, in load_module\n    m = _load_module(path)\n  File "/home/nvidia/tvm/python/tvm/module.py", line 222, in load\n    _cc.create_shared(path + ".so", files)\n  File "/home/nvidia/tvm/python/tvm/contrib/cc.py", line 33, in create_shared\n    _linux_shared(output, objects, options, cc)\n  File "/home/nvidia/tvm/python/tvm/contrib/cc.py", line 58, in _linux_shared\n    raise RuntimeError(msg)\nRuntimeError: Compilation error:\n/usr/bin/ld: /tmp/tmpZ3S7Sa/lib.o: Relocations in generic ELF (EM: 62)\n/usr/bin/ld: /tmp/tmpZ3S7Sa/lib.o: Relocations in generic ELF (EM: 62)\n/tmp/tmpZ3S7Sa/lib.o: error adding symbols: File in wrong format\ncollect2: error: ld returned 1 exit status\n\n',),), error_no=4, all_cost=3.4724888801574707, timestamp=1536378631.6049912)	[('tile_f', [1, 1, 1, 1]), ('tile_y', [112, 1, 2, 1]), ('tile_x', [8, 14, 1, 2]), ('tile_rc', [1, 3]), ('tile_ry', [5, 1]), ('tile_rx', [1, 5]), ('auto_unroll_max_step', 512), ('unroll_explicit', 0)],direct,None,667532
DEBUG:autotvm:No: 3	GFLOPS: 0.00/0.00	result: MeasureResult(costs=(RuntimeError('Except caught from RPC call: TVMCall CFunc Error:\nTraceback (most recent call last):\n  File "/home/nvidia/tvm/python/tvm/_ffi/_ctypes/function.py", line 54, in cfun\n    try:\n  File "/home/nvidia/tvm/python/tvm/rpc/server.py", line 50, in load_module\n    m = _load_module(path)\n  File "/home/nvidia/tvm/python/tvm/module.py", line 222, in load\n    _cc.create_shared(path + ".so", files)\n  File "/home/nvidia/tvm/python/tvm/contrib/cc.py", line 33, in create_shared\n    _linux_shared(output, objects, options, cc)\n  File "/home/nvidia/tvm/python/tvm/contrib/cc.py", line 58, in _linux_shared\n    raise RuntimeError(msg)\nRuntimeError: Compilation error:\n/usr/bin/ld: /tmp/tmpVrOrAD/lib.o: Relocations in generic ELF (EM: 62)\n/usr/bin/ld: /tmp/tmpVrOrAD/lib.o: Relocations in generic ELF (EM: 62)\n/tmp/tmpVrOrAD/lib.o: error adding symbols: File in wrong format\ncollect2: error: ld returned 1 exit status\n\n',),), error_no=4, all_cost=2.699096441268921, timestamp=1536378632.7313952)	[('tile_f', [1, 1, 1, 1]), ('tile_y', [1, 16, 2, 7]), ('tile_x', [112, 1, 1, 2]), ('tile_rc', [3, 1]), ('tile_ry', [5, 1]), ('tile_rx', [1, 5]), ('auto_unroll_max_step', 0), ('unroll_explicit', 0)],direct,None,214964

One of the responses in the thread "Can't run RPC GPU tutorial on my own device" encountered EM: 62 errors as well, so I made sure to set the target and target host correctly. However, the problem persists.

Does anyone have suggestions on what could be causing this?

Where do you add target_host?

You should add it here

Do you use RPC Tracker? Maybe you can share your full script.

I was adding target_host in the compile function but not in the tuning function. Adding it to the tuning function as well resolved my issue. Thank you!

And for reference, yes, I was using the RPC tracker to tune on the TX2.
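
For anyone else who lands here: EM: 62 in the linker message is the ELF machine code for x86-64 (AArch64 is 183), which suggests that the lib.o uploaded to the TX2 was built for the host rather than for aarch64. Below is a minimal sketch, not part of TVM, for checking which architecture an object file was built for by reading its ELF header. The helper names elf_machine and describe_elf are made up for illustration.

```python
import struct

# e_machine values defined by the ELF specification.
ELF_MACHINES = {62: "x86-64", 183: "AArch64"}


def elf_machine(header: bytes) -> int:
    """Return the e_machine field from the first 20 bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # Byte 5 (EI_DATA) gives the byte order: 1 = little endian, 2 = big endian.
    endian = "<" if header[5] == 1 else ">"
    # e_machine is a 16-bit field at offset 18, after e_ident (16 bytes)
    # and e_type (2 bytes).
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return machine


def describe_elf(path: str) -> str:
    """Return a readable architecture name for the ELF file at `path`."""
    with open(path, "rb") as f:
        machine = elf_machine(f.read(20))
    return ELF_MACHINES.get(machine, "unknown (%d)" % machine)


# Hypothetical usage on an exported object file:
#   print(describe_elf("lib.o"))
```

If this reports x86-64 for the library produced during tuning, the host target is probably not reaching the tuning step, which matches the fix described above.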

Have you solved this problem? I am getting the same errors.