TVM's get_output function is time-consuming with Mali OpenCL on RK3399


Today I ran into very bad performance with the mobilenet_ssd_300 model (without the priorbox and detection layers): the “run” function takes about 1.9 s. I can no longer get the 230 ms I reported before, and I don’t know why.
Separately, I tried to reproduce your benchmark results, and here is what I got:

  • mobilenet: ~80ms
  • resnet18: 0.19822525763333332 s
  • vgg16: 0.98832358445 s

My resnet18 and vgg16 results differ a lot from the results here.
Here is my script. I don’t use RPC.
Is there something wrong with my implementation?


Can you check whether you have the current configs for Mali from TopHub?


@eqy I checked out the latest TVM code and I can now reproduce the mobilenet_ssd_300 performance I reported before (250 ms).
I found that we don’t have pre-tuned parameters for this model. So will the performance be better if I use AutoTVM?




Yes, there are no guarantees about using fallback configs. The performance will very likely be better with AutoTVM.


@merrymercy Is this also the case when running on the CPU?


On the CPU, “run” should take the “real” running time.
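To illustrate why the time can show up in get_output instead of run on a GPU backend, here is a toy sketch in plain Python. It does not use TVM at all: FakeAsyncDevice, kernel_seconds and timed are hypothetical names made up for this example, and it only assumes that the device queue behaves asynchronously (run enqueues the work and returns, get_output waits for it).

```python
import time
from concurrent.futures import ThreadPoolExecutor


class FakeAsyncDevice:
    """Toy stand-in for an asynchronous device queue (hypothetical, not a TVM class)."""

    def __init__(self, kernel_seconds):
        self.kernel_seconds = kernel_seconds
        self._pool = ThreadPoolExecutor(max_workers=1)
        self._pending = None

    def run(self):
        # Enqueue the simulated kernel and return without waiting for it.
        self._pending = self._pool.submit(time.sleep, self.kernel_seconds)

    def get_output(self):
        # Block until the queued work has finished, then hand back a dummy result.
        self._pending.result()
        return "output"

    def close(self):
        # Release the worker thread.
        self._pool.shutdown()


def timed(fn):
    """Return the wall-clock seconds spent inside fn()."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


dev = FakeAsyncDevice(kernel_seconds=0.2)
t_run = timed(dev.run)          # only measures the enqueue
t_get = timed(dev.get_output)   # absorbs the wait for the simulated kernel
dev.close()

print(f"run: {t_run * 1e3:.1f} ms, get_output: {t_get * 1e3:.1f} ms")
```

In this toy model almost all of the 0.2 s lands in get_output, so timing run alone would look far too fast. Under the same assumption, a fair measurement would time run and get_output together, or force a synchronization before stopping the clock.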