Issue with Pad op when importing ONNX models


#1

I’m trying a simple import of a ResNet ONNX model and encountering issues.

In particular, I’m using:

import onnx
import nnvm.frontend  # NNVM's ONNX frontend

onnx_model = onnx.load_model(args.model)
sym, params = nnvm.frontend.from_onnx(onnx_model)

Using the following code to export the model to ONNX:

import torch
import resnet  # CIFAR-10 ResNet definitions from the linked Gist

# Load the pretrained checkpoint into the model
model = resnet.__dict__['resnet110']()
checkpoint = torch.load('pretrained_models/resnet110.th')
model.load_state_dict(checkpoint['state_dict'])

# Export with a dummy CIFAR-10-shaped input
dummy_input = torch.autograd.Variable(torch.randn(1, 3, 32, 32))
input_names = ["data"]
output_names = ["output"]
torch.onnx.export(model, dummy_input, 'cifar10_resnet110.onnx',
                  input_names=input_names, output_names=output_names)

(Full code to generate the ONNX model and reproduce the issue on CPU is available at https://gist.github.com/skoppula/6773d5ba37499bd0c5045e0ec9c0b9c4; a link to the ONNX file is also in the Gist. The pretrained PyTorch model being imported is available at https://github.com/akamaster/pytorch_resnet_cifar10/raw/master/pretrained_models/resnet110.th.)

The import fails on the Pad op with the following traceback:

Traceback (most recent call last):
  File "onnx_model_to_shared_library_cpu.py", line 31, in main
    sym, params = nnvm.frontend.from_onnx(onnx_model)
  File "nnvm-0.8.0-py3.6.egg/nnvm/frontend/onnx.py", line 974, in from_onnx
    sym, params = g.from_onnx(graph, opset)
  File "nnvm/frontend/onnx.py", line 829, in from_onnx
    op = self._convert_operator(op_name, inputs, attr, opset)
  File "nnvm/frontend/onnx.py", line 930, in _convert_operator
    sym = convert_map[op_name](inputs, attrs, self._params)
  File "nnvm/frontend/onnx.py", line 207, in _impl_v1
    channels = _infer_channels(inputs[1], params, True)
IndexError: list index out of range
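For what it's worth, the failure mode itself looks straightforward: the converter indexes a second input (inputs[1]) that the node doesn't have. A minimal sketch of that mismatch, with hypothetical names (this is not the NNVM code itself):

```python
# The converter assumes the node carries a second input, e.g. for channel
# inference, but the exported node has only one. Indexing past the end of
# the input list raises exactly the IndexError seen in the traceback.
inputs = ['pad_input']  # hypothetical single-input node

try:
    second = inputs[1]
except IndexError as err:
    print('IndexError:', err)  # "list index out of range"
```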

I suspect this is a bug in the ONNX import code: the ONNX file passes ONNX's graph validation (onnx.checker.check_graph), and I'm able to run inference with the same ONNX model in other frameworks.

Any help or pointers appreciated.


#2

It appears my ONNX model was non-compliant in some way. I'm not sure whether this is a PyTorch export problem or an NNVM import problem, but MXNet was unable to import the model either.

I ended up using a ResNet110 model trained in MXNet and exported to ONNX from MXNet, and that worked fine.

I will investigate the root cause another time.