Convert PyTorch to NNVM/TVM: accuracy is zero

eric · July 20, 2018, 10:49pm

Hi,

I am new to NNVM/TVM, and I ran into an accuracy problem when converting a pre-trained PyTorch model (ResNet50) to NNVM/TVM.

Problem description:

When I use the pre-trained ResNet50 PyTorch model for prediction, the accuracy is
top 1 accuracy : 0.46
top 5 accuracy : 0.79
However, both accuracies become zero after I convert to TVM and execute.

I also used the ONNX-TensorFlow backend to run prediction, and the accuracy is
onnx-tf top1 accuracy : 0.5
onnx-tf top5 accuracy : 0.81

What I did

I didn’t find a direct way to convert PyTorch to NNVM/TVM, so I converted the PyTorch model to an ONNX representation first, and then used this tutorial to compile ONNX to NNVM and execute it with TVM.

How to reproduce

To reproduce my issue, please download my source code here.

Big Thanks!

tqchen · July 21, 2018, 5:19pm

This is a bit strange. Can you try running some of your images through the ONNX tutorial and see what is going on? It could be due to a mismatch in preprocessing.
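A preprocessing mismatch is easiest to rule out by feeding every backend exactly the same array. Below is a minimal sketch of such a shared preprocessing step. It assumes the commonly used torchvision ImageNet mean/std values and an NCHW float32 input; the thread does not confirm which values the original scripts used, and `preprocess` and `dummy` are hypothetical names for illustration.

```python
import numpy as np

# Assumption: the usual torchvision ImageNet statistics. The thread does not
# confirm which values the original scripts used.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(img_hwc_uint8):
    """Turn an HxWx3 uint8 RGB image into a 1x3xHxW float32 batch (NCHW)."""
    x = img_hwc_uint8.astype(np.float32) / 255.0     # scale to [0, 1]
    x = (x - IMAGENET_MEAN) / IMAGENET_STD           # per-channel normalization
    x = x.transpose(2, 0, 1)                         # HWC -> CHW
    return np.ascontiguousarray(x[np.newaxis, ...])  # add the batch dimension


# Hypothetical stand-in for an image that was already decoded, resized and cropped.
dummy = np.full((224, 224, 3), 128, dtype=np.uint8)
batch = preprocess(dummy)
print(batch.shape, batch.dtype)
```

If the same `batch` goes to PyTorch, ONNX-TensorFlow and TVM, any remaining difference in the predictions cannot come from preprocessing.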

eric · July 23, 2018, 8:10pm

@tqchen Thanks for your reply!

Sorry, maybe I didn’t describe it precisely. We have 3 experiments.

First, I use the pre-trained PyTorch model. The accuracy is:
top 1 accuracy : 0.46
top 5 accuracy : 0.79

Then, I convert PyTorch to ONNX. Since PyTorch can’t import ONNX models, I import the ONNX model into TensorFlow and run inference. The accuracy is:
top1 accuracy : 0.5
top5 accuracy : 0.81

Last, I convert PyTorch to ONNX, and then ONNX to NNVM/TVM. The accuracy is:
top1 accuracy: 0.0
top5 accuracy: 0.02

Therefore, I think the issue is probably in the ONNX to NNVM/TVM step, or maybe I didn’t use the TVM stack correctly.
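For reference, the top-1 and top-5 figures in these experiments can be computed the same way for every backend. This is a minimal sketch, assuming the scores are an (N, num_classes) array and the labels are integer class indices; `topk_accuracy` and the sample values are hypothetical and not taken from the original scripts.

```python
import numpy as np


def topk_accuracy(scores, labels, k=1):
    """Fraction of rows whose true label is among the k highest scores."""
    scores = np.asarray(scores)
    labels = np.asarray(labels)
    # Indices of the k largest scores per row; a stable sort keeps ties deterministic.
    topk = np.argsort(-scores, axis=1, kind="stable")[:, :k]
    hits = (topk == labels[:, None]).any(axis=1)
    return float(hits.mean())


# Hypothetical scores for 3 images and 3 classes.
scores = np.array([[0.1, 0.7, 0.2],
                   [0.5, 0.3, 0.2],
                   [0.2, 0.3, 0.5]])
labels = np.array([1, 1, 2])
print(topk_accuracy(scores, labels, k=1))
print(topk_accuracy(scores, labels, k=2))
```

Using one function for all three pipelines makes sure the numbers being compared are measured identically.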

tqchen · July 23, 2018, 8:20pm

That is why I suggest running through the tutorial with a single image to see what it classifies to, and whether there is any mismatch in the pipeline.
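A single-image check like this can be sketched as a direct comparison of the raw outputs from two backends. The helper below is only an illustration: `compare_outputs` and the sample score vectors are hypothetical, and it assumes both backends return a score vector over the same classes in the same order.

```python
import numpy as np


def compare_outputs(ref, other, k=5):
    """Compare two score vectors produced for the same input image."""
    ref = np.asarray(ref, dtype=np.float64).ravel()
    other = np.asarray(other, dtype=np.float64).ravel()
    # A stable sort keeps the ordering of tied scores deterministic.
    top_ref = np.argsort(-ref, kind="stable")[:k]
    top_other = np.argsort(-other, kind="stable")[:k]
    return {
        "max_abs_diff": float(np.max(np.abs(ref - other))),
        "top1_match": bool(top_ref[0] == top_other[0]),
        "topk_ref": top_ref.tolist(),
        "topk_other": top_other.tolist(),
    }


# Hypothetical scores: a reference backend versus a backend that returns a constant vector.
ref_scores = np.array([0.1, 2.5, 0.3, 1.7])
tvm_scores = np.array([1.0, 1.0, 1.0, 1.0])
report = compare_outputs(ref_scores, tvm_scores, k=2)
print(report)
```

A large `max_abs_diff` on an identical input points at the conversion or the input handling rather than at the evaluation loop.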

jjiang2cal · September 14, 2018, 11:00pm

@tqchen
I have a similar experience with caffe2 squeezenet. The conversion from caffe2 to onnx succeeded. When I import the onnx model back into caffe2, the top 1 accuracy is 0.63. However, when I convert the onnx model to nnvm and run it on tvm, I get a top 1 accuracy of 0.0 (top 5 accuracy is also 0.0). The model is from https://github.com/caffe2/models/tree/master/squeezenet, with input_shape (1, 3, 227, 227). I did change the input_shape to the correct one.

When I examine tvm_out (as in the code below), I find that all of its elements are 1., so every prediction (argmax) is just 0. It is the same for every input image.

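    # Load the weights, then bind the image array to the network input.
    module.set_input(**params)
    module.set_input(input_name, tvm.nd.array(img_.astype('float32')))
    module.run()
    # Fetch output 0 into a preallocated buffer and take the top-1 class.
    out = module.get_output(0, tvm.nd.empty(output_shape, 'float32'))
    tvm_out = out.asnumpy()
    pred = tvm_out.argmax()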

tvm_out:

>>> tvm_out
array([[1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
        ......,
        1., 1., 1., 1., 1., 1., 1., 1.]], dtype=float32)
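The symptom above can be detected automatically before accuracy is computed at all. A minimal sketch, assuming the output is a NumPy array; `is_constant_output` is a hypothetical helper, and the (1, 1000) shape is an assumed ImageNet-style class count rather than something stated in the post.

```python
import numpy as np


def is_constant_output(out, tol=0.0):
    """True if every element of the output is (nearly) the same value."""
    out = np.asarray(out)
    return bool(out.max() - out.min() <= tol)


# Hypothetical stand-in for tvm_out: every score equals 1.
tvm_like = np.ones((1, 1000), dtype=np.float32)
print(is_constant_output(tvm_like))
# argmax returns the first index among ties, so the prediction is always class 0.
print(int(tvm_like.argmax()))
```

A constant output for every image means the input is not influencing the result, so the input binding (name, shape, dtype) and the loaded parameters are worth checking first.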

jjiang2cal · September 25, 2018, 11:30pm

@masahi

Any idea on this pytorch or caffe2 issue? Thanks!

masahi · September 26, 2018, 12:10am

I don’t know, but if you can give me your script, onnx model, and an input image to reproduce your issue, I can have a look.

jjiang2cal · September 26, 2018, 6:08pm

@masahi

My script, caffe2 model, and input images are uploaded to:

Images and the labels are in dataset/, and the code is in caffe2/src.

Thank you very much!

masahi · September 26, 2018, 11:02pm

Thanks, can you also upload the onnx model? I don’t want to go through the trouble of installing caffe2.
And can you tell me why you are saving the nnvm model to .so and then loading it to do nnvm inference?

jjiang2cal · October 2, 2018, 4:55pm

@masahi

The onnx model is uploaded to onnx_nnvm/caffe2/onnx_models/

I save the compiled nnvm model to files and load it back just to avoid the compile time while debugging: once the files exist, I comment out the compiling part.

Thanks.

hao_lin · October 16, 2018, 11:29am

Hi, I have converted many pre-trained PyTorch models to ONNX and then compiled them correctly with NNVM/TVM.
In my comparisons, PyTorch and TVM give the same results.
I have also used the TVM C++ API with the graph (.json), params (.param), and lib (.so) files compiled with NNVM.
Everything worked normally.

jjiang2cal · October 16, 2018, 9:22pm

@hao_lin
What models have you tried?

david · August 10, 2020, 1:09am

Hi, I ran into the same issue. Did you fix it? How? Thanks very much.