Auto-tuning with Relay: RPC Tracker Error

Hello,

I’m trying to run the auto-tuning example (tune_relay_cuda.py). To use the LocalRunner instead of the RPCRunner, I uncommented line 130 and commented out lines 131-134. I also uncommented line 259. I didn’t modify the script apart from that.

When I run the script, it tunes for a couple of hours and then crashes with the following error:

    Extract tasks...
    Tuning...
    [Task  1/20]  Current/Best:  465.02/1911.91 GFLOPS | Progress: (1008/2000) | 4120.80 s Done.
    [Task  2/20]  Current/Best:  443.08/3984.69 GFLOPS | Progress: (648/2000) | 8833.43 s Done.
    [Task  3/20]  Current/Best:   51.50/5429.07 GFLOPS | Progress: (1536/2000) | 22423.18 s Done.
    [Task  4/20]  Current/Best: 5041.79/7090.46 GFLOPS | Progress: (1560/2000) | 26721.10 s Done.
    [Task  5/20]  Current/Best:    0.00/   0.00 GFLOPS | Progress: (0/2000) | 0.00 s
    Traceback (most recent call last):
      File "tune_relay_cuda.py", line 259, in <module>
        tune_and_evaluate(tuning_option)
      File "tune_relay_cuda.py", line 228, in tune_and_evaluate
        tune_tasks(tasks, **tuning_opt)
      File "tune_relay_cuda.py", line 209, in tune_tasks
        autotvm.callback.log_to_file(tmp_log_file)])
      File "/home/a.../.local/lib/python3.7/site-packages/tvm-0.6.dev0-py3.7-linux-x86_64.egg/tvm/autotvm/tuner/xgboost_tuner.py", line 86, in tune
        super(XGBTuner, self).tune(*args, **kwargs)
      File "/home/a.../.local/lib/python3.7/site-packages/tvm-0.6.dev0-py3.7-linux-x86_64.egg/tvm/autotvm/tuner/tuner.py", line 108, in tune
        measure_batch = create_measure_batch(self.task, measure_option)
      File "/home/a.../.local/lib/python3.7/site-packages/tvm-0.6.dev0-py3.7-linux-x86_64.egg/tvm/autotvm/measure/measure.py", line 252, in create_measure_batch
        attach_objects = runner.set_task(task)
      File "/home/a.../.local/lib/python3.7/site-packages/tvm-0.6.dev0-py3.7-linux-x86_64.egg/tvm/autotvm/measure/measure_methods.py", line 341, in set_task
        super(LocalRunner, self).set_task(task)
      File "/home/a.../.local/lib/python3.7/site-packages/tvm-0.6.dev0-py3.7-linux-x86_64.egg/tvm/autotvm/measure/measure_methods.py", line 211, in set_task
        raise RuntimeError("Cannot get remote devices from the tracker. "
    RuntimeError: Cannot get remote devices from the tracker. Please check the status of tracker by 'python -m tvm.exec.query_rpc_tracker --port [THE PORT YOU USE]' and make sure you have free devices on the queue status.

I don’t understand why I’m getting an error about the RPC tracker when using the LocalRunner. And I can’t check the status of the tracker because I never started it.
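One thing the traceback does show: LocalRunner.set_task (measure_methods.py, line 341) calls super().set_task, and it is that parent method (line 211) that raises the tracker error. So the LocalRunner appears to reuse the parent runner's tracker check rather than bypass it. Below is a minimal plain-Python sketch of that pattern, only to illustrate how a "local" runner can surface a tracker error. The names RemoteRunner, LocalOnlyRunner and try_set_task are hypothetical, and this is not TVM's actual code.

```python
class RemoteRunner:
    """Hypothetical stand-in for the parent class seen in the traceback."""

    def __init__(self, tracker_is_reachable):
        # Callable that returns True when the tracker reports a free device.
        self.tracker_is_reachable = tracker_is_reachable
        self.task = None

    def set_task(self, task):
        # The parent performs the tracker check before accepting the task.
        if not self.tracker_is_reachable():
            raise RuntimeError("Cannot get remote devices from the tracker.")
        self.task = task


class LocalOnlyRunner(RemoteRunner):
    """Hypothetical stand-in for a local runner built on the remote one."""

    def set_task(self, task):
        # Any local-only setup would happen here; afterwards the inherited
        # tracker check still runs, so its error can still be raised.
        super().set_task(task)


def try_set_task(runner, task):
    """Return None on success, or the error message if the check fails."""
    try:
        runner.set_task(task)
    except RuntimeError as err:
        return str(err)
    return None


if __name__ == "__main__":
    healthy = LocalOnlyRunner(lambda: True)
    failing = LocalOnlyRunner(lambda: False)
    print(try_set_task(healthy, "task 1"))
    print(try_set_task(failing, "task 5"))
```

If the real classes are structured this way, the error would mean that the check inherited from the parent failed at the start of task 5, not that the RPCRunner was selected by mistake.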

I’d appreciate any help on how to avoid this error so the tuning can finish.

Hi, I came across the same error when using the LocalRunner. Does anyone have a solution? Many thanks.

I got the same error after 12 hours of auto-tuning resnet18 on a GPU (Tesla P100).

Hi, I got the same error when auto-tuning my model. The error seems to be triggered at unpredictable times. Any help, please?