I followed the x86 auto-tuning tutorial to tune the TensorFlow model mobilenet_v1_1.0_224_frozen.pb
for AtomicPi (CPU: Atom x5-Z8350).
The spreadsheet below shows debug evaluation times for each operation for:
- UNTUNED model
- HISTORY_BEST log
- GRAPH_OPT log
spreadsheet tune mobilenet_v1_1.0_224_frozen.pb for Atomicpi
For some reason, the UNTUNED model is about 40% faster overall.
I also noticed that the operator shapes differ between the tuned and untuned models.
I highlighted records with significant time differences:
- Red: tuned shows a worse time
- Green: tuned shows a better time
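The red/green highlighting can also be reproduced with a short script instead of by hand. This is only a sketch: the function name `compare_op_times`, the 20% threshold, and the operator names and timings in the usage section are all placeholders I made up for illustration, not values from the spreadsheet. It also assumes operators can be matched by name, which may not hold when the tuned and untuned graphs have different operator shapes.

```python
def compare_op_times(untuned_us, tuned_us, rel_threshold=0.2):
    """Label per-operator time differences between two profiling runs.

    untuned_us / tuned_us: dict mapping operator name -> time in microseconds.
    rel_threshold: relative change treated as "significant" (assumed 20%).
    Returns a list of (op, untuned, tuned, relative_change, label) tuples,
    where label is "red" (tuned is slower), "green" (tuned is faster) or "".
    Only operators present in both runs are compared.
    """
    rows = []
    for op in sorted(set(untuned_us) & set(tuned_us)):
        base = untuned_us[op]
        new = tuned_us[op]
        rel = (new - base) / base  # > 0 means the tuned run is slower
        if rel > rel_threshold:
            label = "red"
        elif rel < -rel_threshold:
            label = "green"
        else:
            label = ""
        rows.append((op, base, new, rel, label))
    return rows


if __name__ == "__main__":
    # Placeholder numbers for illustration only, NOT measured values.
    untuned = {"op_a": 1000.0, "op_b": 400.0, "op_c": 250.0}
    tuned = {"op_a": 1500.0, "op_b": 200.0, "op_c": 260.0}
    for op, base, new, rel, label in compare_op_times(untuned, tuned):
        print(f"{op:8s} {base:8.1f} {new:8.1f} {rel:+7.1%} {label}")
```

Operators that exist in only one of the two runs are skipped here, so they would need to be checked separately.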
I recommend using View > Zoom at 50-75% on a laptop screen.
The autotune log files are on Dropbox:
14773 Jul 30 05:24 mobilenet_v1_1.0_224_frozen.pb_graph_opt.log
5589213 Jul 30 05:24 mobilenet_v1_1.0_224_frozen.pb.log
10560 Jul 31 21:42 mobilenet_v1_1.0_224_frozen.pb.pick_best.log
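For context, here is a minimal sketch of how the three variants (UNTUNED, HISTORY_BEST, GRAPH_OPT) are usually selected at compile time. It is not my exact script: `build_variant` is a hypothetical helper, and it assumes a TVM version that provides `tvm.transform.PassContext` (older releases use `relay.build_config` instead). `mod`, `params` and `target` are whatever the Relay frontend and your device setup give you.

```python
from contextlib import nullcontext

VARIANTS = ("untuned", "history_best", "graph_opt")


def build_variant(mod, params, target, variant, log_path=None):
    """Compile a Relay module with one of the three schedule sources.

    variant: "untuned"      -> default fallback schedules, no log applied
             "history_best" -> best kernel-level records from the tuning log
             "graph_opt"    -> records produced by the graph tuner
    log_path: path to the corresponding log file (unused for "untuned").
    """
    if variant not in VARIANTS:
        raise ValueError(f"unknown variant {variant!r}, expected one of {VARIANTS}")

    # Imported lazily so the helper can be defined without TVM installed.
    import tvm
    from tvm import autotvm, relay

    if variant == "untuned":
        dispatch = nullcontext()
    elif variant == "history_best":
        dispatch = autotvm.apply_history_best(log_path)
    else:
        dispatch = autotvm.apply_graph_best(log_path)

    # The dispatch context decides which schedule each operator gets.
    with dispatch:
        with tvm.transform.PassContext(opt_level=3):
            return relay.build(mod, target=target, params=params)
```

Building all three with the same `target` and `opt_level` keeps the comparison fair, so that only the schedule source differs between runs.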
I used the following tuning options:
tuning_option = {
    'log_filename': log_file,
    'tuner': 'random',
    'early_stopping': None,
    'measure_option': autotvm.measure_option(
        builder=autotvm.LocalBuilder(),
        runner=autotvm.RPCRunner(
            device_key, host='localhost', port=tr_port,
            number=5,
            timeout=10,
        ),
    ),
}