[auto-tune] Does the auto-tune speed become slower as auto-tuning proceeds?

Hi, I have two questions:

  1. I noticed that the time cost per task grows much longer as auto-tuning proceeds. For example, the first task cost about 2000s, but the 26th task cost more than 10000s. My network has 106 tasks, so do I have to wait more than 10 days to auto-tune it? The log is as follows.

  2. I am tuning the model on a 4-GPU server, and I found that CPU utilization is not very high, as shown below. Is that normal? Why aren't all the CPU cores occupied?

For 1, it might be due to the latency difference between tasks. The workload in the 26th task may have a longer latency on average, so every trial takes longer. And yes, you may need a very long time to tune a deep network.
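
As a rough back-of-the-envelope check, here is a minimal sketch of the estimate. It only uses the numbers from the question (106 tasks, about 2000s for the fastest task seen, about 10000s for the slowest) and assumes every task falls somewhere in that range, which may not hold for your network. The function name `estimate_total_days` is just for illustration.

```python
SECONDS_PER_DAY = 86400.0

def estimate_total_days(num_tasks, seconds_per_task):
    """Total tuning time in days if every task took `seconds_per_task`."""
    return num_tasks * seconds_per_task / SECONDS_PER_DAY

NUM_TASKS = 106  # number of tasks reported in the question

# Assumption: per-task time stays between the two values seen in the log.
low = estimate_total_days(NUM_TASKS, 2000)    # every task as fast as the first
high = estimate_total_days(NUM_TASKS, 10000)  # every task as slow as the 26th

print(f"{low:.1f} to {high:.1f} days")  # prints "2.5 to 12.3 days"
```

So "more than 10 days" is only the upper end of the range; the real total depends on how many tasks are as heavy as the 26th one.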

@comaniac Thank you very much. Another question: when tuning, I see many log lines that say:
Timeout in RPC session, kill…
What does that mean, and how can I speed up the tuning process?
Thank you very much.

Best, Edward