TVM ResNet low performance



I am working through the autotvm-tune-relay-x86 tutorial from the documentation and noticed some strange performance results. Unfortunately, the tutorial does not say which CPU was used to produce its “Sample Output”, but judging by the other settings (again, assuming those are correct), the numbers I’m getting still look very low. Yes, these are AutoTVM tuning numbers, but they should correlate to some extent with final inference performance.

During tuning, the GFLOPS I am getting are around 4-5 times lower than those in the Sample Output. I am running Ubuntu 18.04 on an i7-6700K with the following configuration:
import os  # needed for os.environ below
target = "llvm -mcpu=core-avx2"
batch_size = 1
dtype = "float32"
model_name = "resnet-18"
num_threads = 4
os.environ["TVM_NUM_THREADS"] = str(num_threads)

Am I using a wrong configuration, or is this the expected result?


It is expected. The numbers reported in the tutorial are from an AWS c5.9xlarge instance (18 physical cores, with AVX-512 instructions enabled), using target="llvm -mcpu=skylake-avx512".
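
To see why a gap of this size is plausible, here is a rough back-of-the-envelope sketch comparing theoretical FP32 peak throughput of the two machines. The clock speeds (4.0 GHz for the i7-6700K, 3.0 GHz for the c5.9xlarge) and the two FMA units per core are assumptions for illustration, not measured values, and the helper name peak_gflops is hypothetical. Real sustained clocks, especially under AVX-512 load, are usually lower, so treat the result only as an upper bound on the ratio.

```python
def peak_gflops(cores, ghz, simd_lanes_fp32, fma_units=2):
    """Theoretical FP32 peak in GFLOPS.

    cores * GHz * SIMD lanes * 2 (one FMA = multiply + add) * FMA units.
    All inputs are assumptions supplied by the caller.
    """
    return cores * ghz * simd_lanes_fp32 * 2 * fma_units


# i7-6700K: 4 cores, AVX2 (8 float32 lanes); 4.0 GHz is an assumed clock.
desktop = peak_gflops(cores=4, ghz=4.0, simd_lanes_fp32=8)

# c5.9xlarge: 18 physical cores, AVX-512 (16 float32 lanes); 3.0 GHz is an assumed clock.
server = peak_gflops(cores=18, ghz=3.0, simd_lanes_fp32=16)

ratio = server / desktop
print(f"desktop peak: {desktop:.0f} GFLOPS")
print(f"server peak:  {server:.0f} GFLOPS")
print(f"server / desktop: {ratio:.2f}x")
```

Under these assumed numbers the server has several times the theoretical peak of the desktop, which is in the same range as the 4-5x difference seen during tuning.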

Would you like to modify the tutorial to improve the description of the sample output?