Okay, sorry for not having looked hard enough by myself. Thanks to your example I found it was quite easy to implement with existing features; it just needed a modification of the autotuning script.
So basically I am using two temporary files to save the work done:

- `[model].log.tmp`: checkpoints written after each completed task; can be resumed without loss
- `[model].log.task.tmp`: work done for the task currently being processed; cannot be resumed without loss (no need to save it in case of exit or failure)
```python
import os
import tempfile

from tvm import autotvm

tmp_log = log + '.tmp'
for i, tsk in enumerate(reversed(tasks)):
    tuner_obj = ...  # tuner creation from tsk elided here
    # in case of transfer learning, seed the tuner with the completed-tasks log
    if use_transfer_learning and os.path.isfile(tmp_log):
        tuner_obj.load_history(autotvm.record.load_from_file(tmp_log))
    with tempfile.NamedTemporaryFile() as tmp_task_log_file:
        # tune into the blank temporary file
        tuner_obj.tune(..., callbacks=[..., autotvm.callback.log_to_file(tmp_task_log_file.name)])
        # task completed: append the task log to the checkpoints log
        with open(tmp_task_log_file.name) as task_log, open(tmp_log, 'a') as tmp_log_file:
            tmp_log_file.write(task_log.read())
# after tuning all tasks, pick the best records from tmp_log and remove it
autotvm.record.pick_best(tmp_log, log)
os.remove(tmp_log)
```
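Once the best records have been picked into `log`, they can be consumed at compile time with the standard AutoTVM context manager. A minimal sketch; `mod`, `params`, and `target` are assumed to come from the rest of the script, and the exact `PassContext` API depends on your TVM version:

```python
import tvm
from tvm import autotvm, relay

# apply the best tuning records found above while building the model;
# mod, params, and target are assumed from the surrounding script
with autotvm.apply_history_best(log):
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target=target, params=params)
```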
That way I can resume a tuning session as you described, from the checkpoint file (`[model].log.tmp`, i.e. `tmp_log`), without any loss of work.
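For completeness, here is a minimal sketch of what the resume check at startup could look like, assuming each task is tuned with a fixed `n_trial` budget. The helper below and the workload-counting logic are my own illustration, not part of the original script:

```python
import os
from collections import Counter

from tvm import autotvm

def tuned_trials_per_workload(tmp_log):
    # count how many measurement records each workload already
    # has in the checkpoint log (hypothetical helper)
    counts = Counter()
    for inp, _ in autotvm.record.load_from_file(tmp_log):
        counts[str(inp.task.workload)] += 1
    return counts

# on restart, skip tasks whose trial budget was already spent;
# n_trial is an assumed per-task budget from the tuning options
done = tuned_trials_per_workload(tmp_log) if os.path.isfile(tmp_log) else Counter()
tasks_to_tune = [tsk for tsk in tasks if done[str(tsk.workload)] < n_trial]
```

This works because, with the two-file scheme above, `tmp_log` only ever contains records for tasks that finished tuning.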