[AutoTVM] TimeoutError while tuning on ARM CPU

I am trying to tune a MobileNetV3-minimalistic TF Lite model with TVM, targeting arm_cpu. I tuned a MobileNetV3 model without any errors in the log using n_trial = 1500 and early_stopping = 100. But when I set early_stopping to None, AutoTVM switches into debug mode with the log below.

WARNING:autotvm:Too many errors happen in the tuning. Now is in debug mode
DEBUG:autotvm:No: 217   GFLOPS: 0.00/0.00       result: MeasureResult(costs=(TimeoutError(),), error_no=6, all_cost=10, timestamp=1581575643.881052)    [('tile_co', [-1, 1]), ('tile_oh', [-1, 1]), ('tile_ow', [-1, 1]), ('reorder_conv', [0, 1, 2, 3, 7, 4, 5, 6, 8, 9]), ('ann_reduce', ['none', 'unroll']), ('ann_spatial', ['unroll', 'none', 'unroll']), ('compat', 1)],direct,None,3752
DEBUG:autotvm:No: 218   GFLOPS: 0.00/0.00       result: MeasureResult(costs=(InstantiationError(['Too large factor for unrolling', 'Too large factor for unrolling'],),), error_no=1, all_cost=0.024065017700195312, timestamp=1581575643.8814123)      [('tile_co', [-1, 1001]), ('tile_oh', [-1, 1]), ('tile_ow', [-1, 1]), ('reorder_conv', [0, 1, 2, 7, 3, 4, 5, 8, 6, 9]), ('ann_reduce', ['none', 'none']), ('ann_spatial', ['unroll', 'vec', 'unroll']), ('compat', 0)],direct,None,1951
DEBUG:autotvm:No: 219   GFLOPS: 0.00/0.00       result: MeasureResult(costs=(TimeoutError(),), error_no=6, all_cost=10, timestamp=1581575643.9419343)   [('tile_co', [-1, 91]), ('tile_oh', [-1, 1]), ('tile_ow', [-1, 1]), ('reorder_conv', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]), ('ann_reduce', ['none', 'unroll']), ('ann_spatial', ['vec', 'none', 'none']), ('compat', 2)],direct,None,7205

Since I see TimeoutError() in the debug messages, I tried raising timeout from 10 to 1000, following another issue, but I still got the same result. Although there are many error logs, some tasks report non-zero GFLOPS. Does that mean the tuning process is working correctly?

Here is my tuning script.

I use tune_and_evaluate as in tune_relay_arm.py. To use my TF Lite model, I disabled get_network and instead pass the mod and params returned by from_tflite(), together with input_shape, as in the code below.

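import tflite.Model
from tvm import relay

# Read the TF Lite flatbuffer and parse it into a model object
with open(TFLITE_PATH, "rb") as f:
    tflite_model_buf = f.read()
tflite_model = tflite.Model.Model.GetRootAsModel(tflite_model_buf, 0)

# NHWC input layout, as used by the TF Lite model
input_shape = (1, 224, 224, 3)
mod, params = relay.frontend.from_tflite(tflite_model,
                                         shape_dict={"input": input_shape},
                                         dtype_dict={"input": "float32"})

tune_and_evaluate(tuning_option, mod, params, input_shape)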

And I use the tuning options below.

tuning_option = {
    'log_filename': log_file,

    'tuner': 'ga',
    'n_trial': 1500,
    'early_stopping': None,

    'measure_option': autotvm.measure_option(
        builder=autotvm.LocalBuilder(
            build_func='ndk' if use_android else 'default'),
        runner=autotvm.RPCRunner(
            device_key, host='0.0.0.0', port=9190,
            number=5,
            timeout=1000,
        ),
    )
}

(I used GATuner because XGBTuner raised an error.)

  1. If you use the XGB tuner, remove the previous .tmp log and tune again.
  2. Make sure RPC is connected correctly (between the host and the remote board) by running python -m tvm.exec.query_rpc_tracker --host=0.0.0.0 --port=9190
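
It may also help to see which kind of failure dominates before changing any timeout. Below is a minimal sketch that tallies the error_no values in the debug output. The code-to-name table is my assumption of how tvm.autotvm.measure.MeasureErrorNo is laid out (please check it against your TVM version), and tally_errors and sample are hypothetical names used only for this illustration.

```python
import re
from collections import Counter

# Assumed to mirror tvm.autotvm.measure.MeasureErrorNo; verify against your TVM version.
ERROR_NAMES = {
    0: "NO_ERROR",
    1: "INSTANTIATION_ERROR",
    2: "COMPILE_HOST",
    3: "COMPILE_DEVICE",
    4: "RUNTIME_DEVICE",
    5: "WRONG_ANSWER",
    6: "BUILD_TIMEOUT",
    7: "RUN_TIMEOUT",
    8: "UNKNOWN_ERROR",
}

ERROR_NO_RE = re.compile(r"error_no=(\d+)")


def tally_errors(log_text):
    """Count how often each error_no appears in AutoTVM debug output."""
    counts = Counter()
    for match in ERROR_NO_RE.finditer(log_text):
        code = int(match.group(1))
        # Fall back to a generic label for codes missing from the table
        counts[ERROR_NAMES.get(code, "code %d" % code)] += 1
    return counts


# Shortened copies of the three debug lines from the question
sample = """
DEBUG:autotvm:No: 217 ... MeasureResult(costs=(TimeoutError(),), error_no=6, all_cost=10, ...)
DEBUG:autotvm:No: 218 ... MeasureResult(costs=(InstantiationError(...),), error_no=1, all_cost=0.024, ...)
DEBUG:autotvm:No: 219 ... MeasureResult(costs=(TimeoutError(),), error_no=6, all_cost=10, ...)
"""

for name, count in tally_errors(sample).most_common():
    print(name, count)
```

If that table is right, error_no=6 is a build timeout rather than a run timeout, and all_cost=10 matches what I believe is the default timeout of autotvm.LocalBuilder (10 seconds). In that case the timeout worth raising may be the one on LocalBuilder, not the one on RPCRunner. The InstantiationError entries are configs that the template itself rejected, so they are probably not a sign of a broken setup.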