[Relay][ONNX]load Resnet.onnx to relay failed


#1

import onnx
import numpy as np
import tvm
from tvm import relay

resnet_model_path = 'resnet.onnx'
onnx_model = onnx.load(resnet_model_path)

target = 'llvm'
input_name = '0'
resnet_input = np.random.rand(1, 3, 224, 224)
shape_dict = {input_name: resnet_input.shape}
sym, params = relay.frontend.from_onnx(onnx_model, shape_dict, 'float32')
print('sym', sym)
print('params', params)

with relay.build_config(opt_level=1):
    intrp = relay.build_module.create_executor('graph', sym, tvm.cpu(0), target)
    tvm_output = intrp.evaluate(sym)(tvm.nd.array(resnet_input.astype('float32')), **params).asnumpy()

When I load the resnet.onnx model into Relay, I encounter this issue:

concatenate(%451) an internal invariant was violated while typechecking your program [15:21:56] /home/zgy/tvm/src/relay/op/tensor/transform.cc:204: Check failed: e_dtype == dtype (int64 vs. int32) : relay.concatenate requires all tensors have the same dtype


#2

@zhanggy358

Add this line of code below shape_dict and pass it as the dtype argument:

dtype_dict = {input_name: resnet_input.dtype}
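In context, the suggestion amounts to the sketch below (the commented from_onnx call and the input name '0' are taken from the code in #1; the per-input dtype dict in place of a single dtype string is how I read the suggestion, so treat it as an assumption):

```python
import numpy as np

# Hypothetical sketch based on the code in #1: build a per-input dtype
# dict alongside shape_dict instead of hard-coding 'float32'.
input_name = '0'
resnet_input = np.random.rand(1, 3, 224, 224).astype('float32')
shape_dict = {input_name: resnet_input.shape}
dtype_dict = {input_name: str(resnet_input.dtype)}

# Assumed call shape, mirroring #1:
# sym, params = relay.frontend.from_onnx(onnx_model, shape_dict, dtype_dict)
print(dtype_dict)  # {'0': 'float32'}
```

Note that str() is applied to the numpy dtype so the frontend receives a plain string rather than a dtype object.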


#3

I changed the dtype to torch.float32, but I still get the same error.

This is the resnet-101 export code:
dummy_input = Variable(torch.randn(1, 3, 224, 224))
print('dtype', dummy_input.dtype)  # torch.float32
output = torch.onnx.export(model,
                           dummy_input,
                           'resnet-101.onnx',
                           export_params=True,
                           verbose=True)

This is the from_onnx.py code:
resnet_input = Variable(torch.randn(1, 3, 224, 224))
print('dtype', resnet_input.dtype)  # torch.float32
shape_dict = {input_name: resnet_input.shape}
sym, params = relay.frontend.from_onnx(onnx_model, shape_dict, resnet_input.dtype)


#4

Converting the torch.float32 to np.float32 would probably help.
I don't think TVM has support for torch data types.


#5

OK, I changed the data type to plain float32, but I have the same issue.


#6

It seems that concatenate only supports the expected dtype int64 in Relay. I think the TVM team should extend the supported data types for this operation.


#7

Concatenate supports all data types, but requires that the dtypes of the concatenated tensors are the same. Can you send a reproducible script and/or a dump of the IR?

You can usually obtain this by calling print(relay_expr).


#8

This is resnet18:

%93 = fn (%v0: Tensor[(1, 3, 224, 224), float32]) {
%0 = nn.conv2d(%v0, meta[relay.Constant][0] // , strides=[2, 2], padding=[3, 3], kernel_size=[7, 7]) //
%1 = nn.batch_norm(%0, meta[relay.Constant][1] // , meta[relay.Constant][2] // , meta[relay.Constant][3] // , meta[relay.Constant][4] // , epsilon=1e-05) //
%2 = %1.0
%3 = nn.relu(%2) //
%4 = nn.max_pool2d(%3, pool_size=[3, 3], strides=[2, 2], padding=[1, 1]) //
%5 = nn.conv2d(%4, meta[relay.Constant][5] // , padding=[1, 1], kernel_size=[3, 3]) //
%6 = nn.batch_norm(%5, meta[relay.Constant][6] // , meta[relay.Constant][7] // , meta[relay.Constant][8] // , meta[relay.Constant][9] // , epsilon=1e-05) //
%7 = %6.0
%8 = nn.relu(%7) //
%9 = nn.conv2d(%8, meta[relay.Constant][10] // , padding=[1, 1], kernel_size=[3, 3]) //
%10 = nn.batch_norm(%9, meta[relay.Constant][11] // , meta[relay.Constant][12] // , meta[relay.Constant][13] // , meta[relay.Constant][14] // , epsilon=1e-05) //
%11 = %10.0
%12 = add(%11, %4) //
%13 = nn.relu(%12) //
%14 = nn.conv2d(%13, meta[relay.Constant][15] // , padding=[1, 1], kernel_size=[3, 3]) //
%15 = nn.batch_norm(%14, meta[relay.Constant][16] // , meta[relay.Constant][17] // , meta[relay.Constant][18] // , meta[relay.Constant][19] // , epsilon=1e-05) //
%16 = %15.0
%17 = nn.relu(%16) //
%18 = nn.conv2d(%17, meta[relay.Constant][20] // , padding=[1, 1], kernel_size=[3, 3]) //
%19 = nn.batch_norm(%18, meta[relay.Constant][21] // , meta[relay.Constant][22] // , meta[relay.Constant][23] // , meta[relay.Constant][24] // , epsilon=1e-05) //
%20 = %19.0
%21 = add(%20, %13) //
%22 = nn.relu(%21) //
%23 = nn.conv2d(%22, meta[relay.Constant][25] // , strides=[2, 2], padding=[1, 1], kernel_size=[3, 3]) //
%24 = nn.batch_norm(%23, meta[relay.Constant][26] // , meta[relay.Constant][27] // , meta[relay.Constant][28] // , meta[relay.Constant][29] // , epsilon=1e-05) //
%25 = %24.0
%26 = nn.relu(%25) //
%27 = nn.conv2d(%26, meta[relay.Constant][30] // , padding=[1, 1], kernel_size=[3, 3]) //
%28 = nn.batch_norm(%27, meta[relay.Constant][31] // , meta[relay.Constant][32] // , meta[relay.Constant][33] // , meta[relay.Constant][34] // , epsilon=1e-05) //
%29 = %28.0
%30 = nn.conv2d(%22, meta[relay.Constant][35] // , strides=[2, 2], kernel_size=[1, 1]) //
%31 = nn.batch_norm(%30, meta[relay.Constant][36] // , meta[relay.Constant][37] // , meta[relay.Constant][38] // , meta[relay.Constant][39] // , epsilon=1e-05) //
%32 = %31.0
%33 = add(%29, %32) //
%34 = nn.relu(%33) //
%35 = nn.conv2d(%34, meta[relay.Constant][40] // , padding=[1, 1], kernel_size=[3, 3]) //
%36 = nn.batch_norm(%35, meta[relay.Constant][41] // , meta[relay.Constant][42] // , meta[relay.Constant][43] // , meta[relay.Constant][44] // , epsilon=1e-05) //
%37 = %36.0
%38 = nn.relu(%37) //
%39 = nn.conv2d(%38, meta[relay.Constant][45] // , padding=[1, 1], kernel_size=[3, 3]) //
%40 = nn.batch_norm(%39, meta[relay.Constant][46] // , meta[relay.Constant][47] // , meta[relay.Constant][48] // , meta[relay.Constant][49] // , epsilon=1e-05) //
%41 = %40.0
%42 = add(%41, %34) //
%43 = nn.relu(%42) //
%44 = nn.conv2d(%43, meta[relay.Constant][50] // , strides=[2, 2], padding=[1, 1], kernel_size=[3, 3]) //
%45 = nn.batch_norm(%44, meta[relay.Constant][51] // , meta[relay.Constant][52] // , meta[relay.Constant][53] // , meta[relay.Constant][54] // , epsilon=1e-05) //
%46 = %45.0
%47 = nn.relu(%46) //
%48 = nn.conv2d(%47, meta[relay.Constant][55] // , padding=[1, 1], kernel_size=[3, 3]) //
%49 = nn.batch_norm(%48, meta[relay.Constant][56] // , meta[relay.Constant][57] // , meta[relay.Constant][58] // , meta[relay.Constant][59] // , epsilon=1e-05) //
%50 = %49.0
%51 = nn.conv2d(%43, meta[relay.Constant][60] // , strides=[2, 2], kernel_size=[1, 1]) //
%52 = nn.batch_norm(%51, meta[relay.Constant][61] // , meta[relay.Constant][62] // , meta[relay.Constant][63] // , meta[relay.Constant][64] // , epsilon=1e-05) //
%53 = %52.0
%54 = add(%50, %53) //
%55 = nn.relu(%54) //
%56 = nn.conv2d(%55, meta[relay.Constant][65] // , padding=[1, 1], kernel_size=[3, 3]) //
%57 = nn.batch_norm(%56, meta[relay.Constant][66] // , meta[relay.Constant][67] // , meta[relay.Constant][68] // , meta[relay.Constant][69] // , epsilon=1e-05) //
%58 = %57.0
%59 = nn.relu(%58) //
%60 = nn.conv2d(%59, meta[relay.Constant][70] // , padding=[1, 1], kernel_size=[3, 3]) //
%61 = nn.batch_norm(%60, meta[relay.Constant][71] // , meta[relay.Constant][72] // , meta[relay.Constant][73] // , meta[relay.Constant][74] // , epsilon=1e-05) //
%62 = %61.0
%63 = add(%62, %55) //
%64 = nn.relu(%63) //
%65 = nn.conv2d(%64, meta[relay.Constant][75] // , strides=[2, 2], padding=[1, 1], kernel_size=[3, 3]) //
%66 = nn.batch_norm(%65, meta[relay.Constant][76] // , meta[relay.Constant][77] // , meta[relay.Constant][78] // , meta[relay.Constant][79] // , epsilon=1e-05) //
%67 = %66.0
%68 = nn.relu(%67) //
%69 = nn.conv2d(%68, meta[relay.Constant][80] // , padding=[1, 1], kernel_size=[3, 3]) //
%70 = nn.batch_norm(%69, meta[relay.Constant][81] // , meta[relay.Constant][82] // , meta[relay.Constant][83] // , meta[relay.Constant][84] // , epsilon=1e-05) //
%71 = %70.0
%72 = nn.conv2d(%64, meta[relay.Constant][85] // , strides=[2, 2], kernel_size=[1, 1]) //
%73 = nn.batch_norm(%72, meta[relay.Constant][86] // , meta[relay.Constant][87] // , meta[relay.Constant][88] // , meta[relay.Constant][89] // , epsilon=1e-05) //
%74 = %73.0
%75 = add(%71, %74) //
%76 = nn.relu(%75) //
%77 = nn.conv2d(%76, meta[relay.Constant][90] // , padding=[1, 1], kernel_size=[3, 3]) //
%78 = nn.batch_norm(%77, meta[relay.Constant][91] // , meta[relay.Constant][92] // , meta[relay.Constant][93] // , meta[relay.Constant][94] // , epsilon=1e-05) //
%79 = %78.0
%80 = nn.relu(%79) //
%81 = nn.conv2d(%80, meta[relay.Constant][95] // , padding=[1, 1], kernel_size=[3, 3]) //
%82 = nn.batch_norm(%81, meta[relay.Constant][96] // , meta[relay.Constant][97] // , meta[relay.Constant][98] // , meta[relay.Constant][99] // , epsilon=1e-05) //
%83 = %82.0
%84 = add(%83, %76) //
%85 = nn.relu(%84) //
%86 = nn.global_avg_pool2d(%85) //
%87 = shape_of(%86, dtype="int32") //
%88 = take(%87, int64(0), axis=0) //
%89 = expand_dims(%88, axis=0) //
%90 = expand_dims(int64(-1), axis=0) //
%91 = (%89, %90)
%92 = concatenate(%91) // an internal invariant was violated while typechecking your program [10:45:05] /tvm/src/relay/op/tensor/transform.cc:204: Check failed: e_dtype == dtype (int64 vs. int32) : relay.concatenate requires all tensors have the same dtype
;
%92
}
%93
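The failure is in the tail of the dump (%87 through %92): shape_of is emitted with dtype int32, while the -1 constant the frontend creates for the flatten/reshape is int64. Using numpy as a stand-in (numpy silently promotes instead of erroring, but the dtypes show exactly the mix Relay rejects; the shape values are illustrative):

```python
import numpy as np

# Stand-in for %87-%91: shape_of(..., dtype="int32") vs. an int64 constant.
shape = np.array([1, 512, 1, 1], dtype=np.int32)   # like %87 (shape_of, int32)
batch = np.expand_dims(shape[0], axis=0)           # like %89 -> dtype int32
neg_one = np.expand_dims(np.int64(-1), axis=0)     # like %90 -> dtype int64
print(batch.dtype, neg_one.dtype)                  # int32 int64
# Relay's concatenate requires identical dtypes and fails here;
# numpy instead promotes the result to int64:
print(np.concatenate([batch, neg_one]).dtype)      # int64
```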


#9

Hi, has the problem been solved?


#10

I also met this problem. Does anyone know how to solve it?


#11

No, it has not been solved yet.


#12

The problem seems to be due to how we handle constants in the ONNX frontend. By default, all constant integers default to int64, but in this case int32 seems to have been the desirable choice. That is why the type conflict (int64 vs. int32) occurred.
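As a numpy sketch (not the actual frontend code), the fix amounts to emitting the constant with a dtype that matches the shape tensor before concatenating:

```python
import numpy as np

# Sketch only, not the real frontend fix: cast the frontend's default
# int64 constant to match the int32 output of shape_of before concatenate.
batch = np.array([1], dtype=np.int32)     # from shape_of(..., dtype="int32")
neg_one = np.array([-1], dtype=np.int64)  # frontend's default int64 constant
new_shape = np.concatenate([batch, neg_one.astype(np.int32)])
print(new_shape.dtype)  # int32 -- dtypes now agree, so concatenate succeeds
```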

cc @jroesch @zhreshold @kazum @Huyuwei please see if we could fix this


#13

I looked into this issue; it was due to a bug when doing a transitive import from PyTorch -> ONNX -> Relay.

See https://github.com/dmlc/tvm/pull/3230


#14

Try onnx-simplifier to simplify the ONNX model first, and then use TVM. It worked for me.


#15

Nice work, it solved my problem! Thanks a lot.


#16

There is now code in master that fixes some of these cases; let us know if you have further issues.


#17

I’m the author of onnx-simplifier. I’m very glad that it helps you :slight_smile: