Possible bug? [RELAY] Internal invariant was violated

Building the inceptionv3 example from the gluon model zoo results in an internal invariant violation:

%423 = nn.dense(%422, meta[relay.Constant][470] // , units=1000) // an internal invariant was violated while typechecking your program [16:18:59] /workspace/src/relay/pass/type_solver.cc Check failed: resolved.defined(): Unable to unify parent types: TensorType([1000, ((2048*((-3/8) + 1))*((-3/8) + 1))], float32) and TensorType([1000, 2048], float32)
;

The issue was found with a variant of the “compile mxnet models” tutorial, replacing the model “resnet18_v1” with “inceptionv3”.
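
For reference, a minimal sketch of the tutorial variant described above (assuming the relay.frontend.from_mxnet flow from the tutorial; return values of from_mxnet and relay.build differ slightly between TVM versions):

import tvm
from tvm import relay
from mxnet.gluon.model_zoo import vision

# Load the pretrained Gluon model; the tutorial uses "resnet18_v1",
# and swapping in "inceptionv3" triggers the failure.
block = vision.get_model("inceptionv3", pretrained=True)

# Input shape copied unchanged from the tutorial.
shape_dict = {"data": (1, 3, 224, 224)}

# Depending on the TVM version this returns (func, params) or (mod, params).
mod, params = relay.frontend.from_mxnet(block, shape_dict)

# The "unable to unify" type error above is reported while compiling.
graph, lib, params = relay.build(mod, target="llvm", params=params)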

Any ideas?

In this particular case, it seems to indicate a problem in operator translation. Note the negative number in the shape; that is likely due to a particular implementation of reshape or something similar. However, the reshape problem has likely already been fixed on the latest master. Please confirm whether you still get the problem on the latest master, and dig a bit into which operator comes just before the internal invariant violation.

@mjs have you looked a bit deeper into what is going on?

@tqchen I’ve not looked further into this failure.

Just wondering whether there are any updates? I ran into a similar problem but am not sure how to solve it.

I installed TVM about half a month ago by cloning from “https://github.com/dmlc/tvm”, so I thought it should be close to the latest version.

Any comments are greatly appreciated. Thanks.

The bug report is attached:

fn () {
free_var %data/Placeholder: Tensor[(64, 224, 224, 3), float32]
%0 = nn.pad(%data/Placeholder, pad_width=[[0, 0], [3, 3], [3, 3], [0, 0]])
free_var %model/resnet_model/conv2d/kernel: Tensor[(7, 7, 3, 64), float32]
%1 = nn.conv2d(%0, %model/resnet_model/conv2d/kernel, strides=[2, 2], channels=64, kernel_size=[7, 7], data_layout="NHWC", kernel_layout="HWIO")
%2 = nn.max_pool2d(%1, pool_size=[3, 3], strides=[2, 2], padding=[0, 0, 1, 1], layout="NHWC")
free_var %model/resnet_model/batch_normalization/gamma: Tensor[(64,), float32]
free_var %model/resnet_model/batch_normalization/beta: Tensor[(64,), float32]
free_var %model/resnet_model/batch_normalization/Const: Tensor[(0,), float32]
free_var %model/resnet_model/batch_normalization/Const_1: Tensor[(0,), float32]
%3 = nn.batch_norm(%2, %model/resnet_model/batch_normalization/gamma, %model/resnet_model/batch_normalization/beta, %model/resnet_model/batch_normalization/Const, %model/resnet_model/batch_normalization/Const_1, axis=3, epsilon=1.001e-05)
an internal invariant was violated while typechecking your program [11:44:17] /apsarapangu/disk3/jiandong.mjd/tvm/src/relay/pass/type_solver.cc:119: Check failed: resolved.defined(): Unable to unify parent types: TensorType([64], float32) and TensorType([0], float32)
;
%3.0
}

It seems that this problem is caused by a mismatch of the tensor shapes in the batch_norm layer.

I load the graph from the ckpt file and import it into TVM using relay.frontend.from_tensorflow. I am quite confident that the shapes in the ckpt file are correct, so I am not sure why the “moving mean” and “moving var” become “Const” with a shape of [0] when parsed by Relay.
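
For context, the import described above looks roughly like this (a minimal sketch, assuming a TF 1.x GraphDef serialized from the checkpoint; the file name model.pb is a placeholder, not the actual path):

import tensorflow as tf
import tvm
from tvm import relay

# Load the serialized GraphDef exported from the ckpt.
with tf.gfile.GFile("model.pb", "rb") as f:
    graph_def = tf.GraphDef()
    graph_def.ParseFromString(f.read())

# The graph uses NHWC layout; the placeholder name matches the dump above.
shape_dict = {"data/Placeholder": (64, 224, 224, 3)}
mod, params = relay.frontend.from_tensorflow(graph_def, layout="NHWC", shape=shape_dict)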

I ran into a similar problem, as you can see from my posts above; any comments are greatly appreciated. Thanks.

I believe the problem is due to an incorrect input size. According to the MXNet model zoo website, the input size for the InceptionV3 model should be 299. The tutorial script uses an input size of 224, so changing all occurrences of 224 to 299 fixes the problem (see the sketch below).
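
Concretely, the only change needed in the tutorial script is the input resolution (a sketch of the relevant line only; the image preprocessing needs the same 224 -> 299 change):

# Tutorial default, correct for resnet18_v1:
# shape_dict = {"data": (1, 3, 224, 224)}

# InceptionV3 expects 299x299 inputs:
shape_dict = {"data": (1, 3, 299, 299)}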


Thanks @joshherr, issue resolved.