NDArray conversion to numpy array takes a long time (around 350 ms) even with simple data

Hi,

I have successfully built and tuned an MXNet GluonCV model, and the results it returns (class IDs, scores, boxes) are NDArrays. I want to draw boxes for the person class only, so I need to select the results that match that condition.

I have a big performance issue with the tvm.runtime.ndarray.NDArray conversion: converting to a numpy array always costs around 300 ms.

> for i, cl in enumerate(scores.asnumpy()[0]):
>     prop = cl[0]
>     if prop < 0.5:
>         continue
>     cl_id = int(class_IDs.asnumpy()[0][i][0])
>     if voc_classes[cl_id] != 'person':
>         continue
>     bbox = boxs.asnumpy()[0][i]
>
>     cv2.rectangle(frame, (int(bbox[0]), int(bbox[1])), (int(bbox[2]), int(bbox[3])), 0, 3)
>
>     temboxes.append(bbox)
>     temscore.append(prop)

Is there a better way to filter these results?

Many thanks!!
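
For the filtering itself, here is a minimal sketch that works on plain numpy arrays and replaces the per-element loop with boolean masks. It assumes the output shapes implied by the loop above: class IDs and scores of shape (1, N, 1) and boxes of shape (1, N, 4). The function name `filter_detections` and its parameters are hypothetical, not part of TVM or GluonCV.

```python
import numpy as np

def filter_detections(class_ids, scores, boxes, class_names,
                      target="person", threshold=0.5):
    """Return (boxes, scores) for detections of `target` scoring >= threshold.

    Assumed shapes: class_ids (1, N, 1), scores (1, N, 1), boxes (1, N, 4).
    """
    target_id = class_names.index(target)       # integer id of the wanted class
    ids = class_ids[0, :, 0].astype(np.int64)   # flatten to shape (N,)
    conf = scores[0, :, 0]                      # flatten to shape (N,)
    keep = (conf >= threshold) & (ids == target_id)
    return boxes[0][keep], conf[keep]
```

With the outputs from the question, each NDArray would be converted exactly once before the call, for example `filter_detections(class_IDs.asnumpy(), scores.asnumpy(), boxs.asnumpy(), voc_classes)`, instead of calling asnumpy() inside the loop on every iteration.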

There is actually a way to do a zero-copy conversion from a numpy array to DLPack-compatible arrays (like TVM NDArray, PyTorch tensor, etc.).

Thanks, junrushao1994!

It is likely that the inference hasn’t finished when you call asnumpy(). The call then has to wait for the results, which is why asnumpy() takes some time to complete.

I thought the same, but it’s quite strange :frowning:

No, it is not strange; it is expected. module.run() is async-ish, so to be sure you can insert an explicit sync after run().

Hello! Could you tell me how to do zero-copy from a numpy array to DLPack-compatible arrays?

Hello! I ran into the same problem. Could you tell me how to insert the explicit sync? I don’t know the full name of the function. Thank you very much!

I think it is ctx.sync().
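
To see where the time actually goes, a rough sketch like the one below can time the inference (run plus explicit sync) separately from the copy to numpy. It is duck-typed and does not import TVM: `module` is assumed to be a graph executor module with `run()` and `get_output()`, and `dev` the device/context object whose `sync()` method is mentioned above. The helper name `time_run_and_copy` is hypothetical.

```python
import time

def time_run_and_copy(module, dev, output_index=0):
    """Return (numpy_output, seconds_for_run_and_sync, seconds_for_copy)."""
    t0 = time.perf_counter()
    module.run()   # may return before the device has finished the work
    dev.sync()     # block until all queued work on the device is done
    t1 = time.perf_counter()
    out = module.get_output(output_index).asnumpy()  # copy result to host
    t2 = time.perf_counter()
    return out, t1 - t0, t2 - t1
```

If the explanation in this thread is right, most of the ~300 ms should move into the first timing once the sync is in place, and the asnumpy() part should become small.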

Yeah, here is the example: dlpack/apps/from_numpy at main · dmlc/dlpack · GitHub. Note that we assume the array is on CPU, because numpy arrays are CPU-only.
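
As a small illustration of the zero-copy idea (not the code from the linked example), the sketch below uses NumPy's own DLPack support on both sides. It assumes NumPy 1.22 or newer, where `np.from_dlpack` is available; the helper name `share_via_dlpack` is hypothetical. A DLPack-aware framework would use its own `from_dlpack` entry point in place of `np.from_dlpack`.

```python
import numpy as np

def share_via_dlpack(arr):
    """Return a second array that views the same CPU memory as `arr`.

    The exchange goes through the DLPack protocol, so no data is copied.
    """
    return np.from_dlpack(arr)

a = np.arange(6, dtype=np.float32)
b = share_via_dlpack(a)

a[0] = 42.0   # write through the original array
shared = np.shares_memory(a, b)
print(shared, b[0])   # b observes the write because the buffer is shared
```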