Native inference performance on ARM device

Hello,

I’m deploying TVM to a Linux platform.

As an initial bring-up, I managed to deploy the TVM runtime and a module compiled by the NNVM compiler to an ARM device.
On the ARM device I ran an inference test (the test app is written in C++) for ResNet18_v1 and SqueezeNet1.1.
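For context, the host-side compile step with the NNVM compiler looks roughly like the sketch below; the model name, input shape, target strings and output file names are illustrative assumptions rather than my exact setup.

```python
# Rough sketch of the host-side NNVM compile step targeting a Mali GPU.
# Model, shapes, target strings and file names are assumptions for illustration.
import nnvm.frontend
import nnvm.compiler
import tvm
from tvm.contrib import cc
from mxnet.gluon.model_zoo.vision import get_model

block = get_model("resnet18_v1", pretrained=True)
sym, params = nnvm.frontend.from_mxnet(block)

shape_dict = {"data": (1, 3, 224, 224)}
target = tvm.target.mali()                      # Mali GPU via OpenCL
target_host = "llvm -target=aarch64-linux-gnu"  # host code for the ARM CPU

graph, lib, params = nnvm.compiler.build(
    sym, target=target, target_host=target_host,
    shape=shape_dict, params=params)

# These artifacts are what the C++ test app loads on the device.
lib.export_library("deploy_lib.so", cc.create_shared,
                   cc="aarch64-linux-gnu-g++")
with open("deploy_graph.json", "w") as f:
    f.write(graph.json())
with open("deploy_params.bin", "wb") as f:
    f.write(nnvm.compiler.save_param_dict(params))
```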

ARM(Exynos5433) device info:

  • CPU : 1.9GHz Quad-Core (Cortex®-A57) + 1.3GHz Quad-Core (Cortex®-A53)
  • GPU : Mali™-T760 MP6

The measured performance on the ARM device is as follows:

  • MXNet-based ResNet18_v1 inference: CPU about 280 ms, GPU about 36 ms
  • MXNet-based SqueezeNet1.1 inference: CPU about 133 ms, GPU about 4.7 ms

For accurate measurement, I set the CPU governor to performance mode.
As for the GPU performance, the result is surprising to me, even though the output gives the correct label.

Does this result make sense?

Thanks,
Inki Dae

Yes, they make sense. The GPU is way faster than the CPU.

I know the GPU is generally faster than the CPU. However, in this case the GPU seems too much faster than the CPU.

Yes, it's possible, depending on the GPU configuration (threads and blocks).

Have you used AutoTVM to tune for the ARM CPU? https://docs.tvm.ai/tutorials/autotvm/tune_nnvm_arm.html
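The flow in that tutorial is roughly as follows; here sym, target and target_host are the NNVM symbol and targets you already use for building, while the device key, tracker address and trial budget are placeholders you would replace with your own.

```python
# Condensed sketch of the AutoTVM tuning flow from the tune_nnvm_arm tutorial.
# "sym", "target" and "target_host" come from the model build; the device key,
# tracker address and trial budget are placeholders.
import nnvm
import nnvm.compiler
from tvm import autotvm

tasks = autotvm.task.extract_from_graph(
    sym, shape={"data": (1, 3, 224, 224)}, dtype="float32",
    symbols=(nnvm.sym.conv2d,), target=target, target_host=target_host)

measure_option = autotvm.measure_option(
    builder=autotvm.LocalBuilder(),
    runner=autotvm.RPCRunner("exynos5433", host="0.0.0.0", port=9190,
                             number=4, repeat=3, timeout=10))

for task in tasks:
    tuner = autotvm.tuner.XGBTuner(task, loss_type="rank")
    tuner.tune(n_trial=min(1000, len(task.config_space)),
               measure_option=measure_option,
               callbacks=[autotvm.callback.log_to_file("tuning.log")])

# Rebuild the model with the best schedules found during tuning.
with autotvm.apply_history_best("tuning.log"):
    graph, lib, params = nnvm.compiler.build(
        sym, target=target, target_host=target_host,
        shape={"data": (1, 3, 224, 224)}, params=params)
```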

Your GPU results are suspicious.

You should compare your numbers with our benchmark here: https://github.com/dmlc/tvm/wiki/Benchmark#mobile-gpu. You can compare the frequency and number of cores of the GPU.

You can also validate it through RPC using our benchmark script, which is easier and more accurate.
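A measurement over RPC, in the spirit of the scripts under apps/benchmark, looks roughly like this; the tracker address, device key and the graph/lib/params produced by the compile step are assumptions here.

```python
# Sketch of timing one model over RPC with time_evaluator, similar in spirit
# to the apps/benchmark scripts. Tracker address, device key and the
# graph/params from the compile step are assumed.
import numpy as np
import tvm
from tvm import rpc
from tvm.contrib import graph_runtime

tracker = rpc.connect_tracker("0.0.0.0", 9190)
remote = tracker.request("exynos5433")

remote.upload("deploy_lib.so")
rlib = remote.load_module("deploy_lib.so")

ctx = remote.cl(0)  # Mali GPU via OpenCL; use remote.cpu(0) for the CPU case
module = graph_runtime.create(graph, rlib, ctx)
module.set_input(**params)
module.set_input("data", tvm.nd.array(
    np.random.uniform(size=(1, 3, 224, 224)).astype("float32")))

# time_evaluator runs the whole graph repeatedly and reports statistics,
# which avoids the noise of timing a single run.
ftimer = module.module.time_evaluator("run", ctx, number=10, repeat=3)
prof_res = np.array(ftimer().results) * 1000  # convert to milliseconds
print("Mean inference time: %.2f ms (std %.2f ms)" % (prof_res.mean(), prof_res.std()))
```

Timing the same library on remote.cl(0) and remote.cpu(0) back to back gives a direct comparison against the numbers in the wiki table.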

Thanks for the reply.
I will try to validate it using the benchmark script.

Thanks,
Inki Dae

By the way, have you tested inference on a real device with C++ code, rather than via RPC with Python code?
As I mentioned above, the output gives the correct label, and the response time is much faster than on the CPU. Anyway, I will also test other models to make sure.

Thanks,
Inki Dae

I’ve never used AutoTVM.

I think you should tune it with AutoTVM to get a better result.

Thanks for the advice. :slight_smile:

hi,

How can I run AutoTVM when my ARM platform has no Python environment, which means I can’t use RPC?