MobileNet_V1 compilation problem at opt_level=3

Hello,

I found a possible bug in TVM when compiling MobileNet_V1 (https://github.com/keras-team/keras-applications/blob/master/keras_applications/mobilenet.py) with opt_level=3. The Relay code at the point of the error looks like this:

%304 = transpose(%303, axes=[0, 3, 1, 2]);
  %305 = layout_transform(%304, src_layout="NCHW", dst_layout="NCHW8c") an internal invariant was violated while typechecking your program [11:42:56] /Users/alopez/Documents/Code/tvm/include/tvm/tir/expr.h:143: Check failed: a.dtype() == b.dtype(): TypeError: mismatched types

; ;
  %306 = layout_transform(meta[relay.Constant][81], src_layout="OIHW", dst_layout="OIHW8i8o");
  %307 = nn.contrib_conv2d_NCHWc(%305, %306, padding=[0, 0, 0, 0], channels=1000, kernel_size=[1, 1], data_layout="NCHW8c", kernel_layout="OIHW8i8o", out_layout="NCHW8c");
  %308 = layout_transform(%307, src_layout="NCHW8c", dst_layout="NCHW") an internal invariant was violated while typechecking your program [11:42:56] /Users/alopez/Documents/Code/tvm/src/relay/op/tensor/transform.cc:2328: Check failed: data != nullptr: 
; ;
  %309 = transpose(%308, axes=[0, 2, 3, 1]);
  %310 = add(%309, meta[relay.Constant][82]);
  %311 = reshape(%310, newshape=[1, 1000]);
  nn.softmax(%311)
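For context, the NCHW → NCHW8c `layout_transform` that fails is conceptually just a split of the channel axis into blocks of 8 followed by a transpose. The following numpy sketch illustrates the equivalent shape math (it is an illustration only, not TVM's actual implementation):

```python
import numpy as np

def nchw_to_nchw8c(x, c_block=8):
    """Split the C axis of an NCHW tensor into C//c_block outer channels
    and c_block inner channels (NCHW -> NCHW{c_block}c)."""
    n, c, h, w = x.shape
    assert c % c_block == 0, "channel count must divide evenly into blocks"
    # NCHW -> (N, C/8, 8, H, W) -> (N, C/8, H, W, 8)
    return x.reshape(n, c // c_block, c_block, h, w).transpose(0, 1, 3, 4, 2)

# The tensor feeding the failing %305 is the (1, 1, 1, 1024) reshape output,
# transposed to NCHW, i.e. shape (1, 1024, 1, 1).
x = np.zeros((1, 1024, 1, 1), dtype="float32")
y = nchw_to_nchw8c(x)
print(y.shape)  # (1, 128, 1, 1, 8)
```

The shape arithmetic itself is fine here (1024 divides evenly by 8), which suggests the type error comes from how the pass builds the index expressions rather than from an invalid shape.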

After some searching I found a possible connection to ([Bug][Relay] AlterOpLayout attempting bad conversions for CPU), at the very end of that discussion. Based on this and a hunch, I tried setting opt_level=2 and the problem went away, so perhaps an overly aggressive optimization is at fault. I haven't had time to look deeper into this, but I wanted to share a workaround in case someone else hits this problem.

cc @anijain2305 @kevinthesun

> Is it possible to have a script to reproduce the error?

Sure, this is the minimal code to reproduce:

from tvm import relay
import tensorflow as tf
#
# Model parameters
#
myTarget = "llvm"
layout = "NCHW"
model_name = "model.pb"
input_name = "input_1"
output_name = ["act_softmax/Softmax"]
dshape = (1, 224, 224, 3)
shape_dict = {input_name: dshape}
#
# Read the frozen PB file for MobileNetV1
#
with tf.io.gfile.GFile("./" + model_name, 'rb') as f:
    graph_def = tf.compat.v1.GraphDef()
    graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')

sym, params = relay.frontend.from_tensorflow(graph_def,
                                             layout=layout,
                                             shape=shape_dict,
                                             outputs=output_name)

with relay.build_config(opt_level=3):
    graph, lib, params = relay.build(sym, myTarget, params=params)

lib.export_library("./model.so")
with open("./model.json", "w") as fo:
    fo.write(graph)
with open("./params.params", "wb") as fo:
    fo.write(relay.save_param_dict(params))

I am reading a frozen-session file in which variables have been converted to constants. I can upload it too, but it's a 16 MB file; in any case, the model comes from (https://github.com/keras-team/keras-applications/blob/master/keras_applications/mobilenet.py).

Error log:

  File "tvm_Opt_error.py", line 47, in <module>
    graph, lib, params = relay.build(sym, myTarget, params=params)

  File "/Users/alopez/Documents/Code/tvm/python/tvm/relay/build_module.py", line 251, in build
    graph_json, mod, params = bld_mod.build(mod, target, target_host, params)

  File "/Users/alopez/Documents/Code/tvm/python/tvm/relay/build_module.py", line 120, in build
    self._build(mod, target, target_host)

  File "/Users/alopez/Documents/Code/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 216, in __call__
    raise get_last_ffi_error()

tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (8) 9   libtvm.dylib                        0x000000012066a55e tvm::transform::SequentialNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const + 1070
  [bt] (7) 8   libtvm.dylib                        0x000000012066a943 tvm::transform::Pass::operator()(tvm::IRModule, tvm::transform::PassContext const&) const + 179
  [bt] (6) 7   libtvm.dylib                        0x0000000120e7650f tvm::relay::transform::FunctionPassNode::operator()(tvm::IRModule, tvm::transform::PassContext const&) const + 1647
  [bt] (5) 6   libtvm.dylib                        0x0000000120649520 tvm::IRModuleNode::Add(tvm::GlobalVar const&, tvm::BaseFunc const&, bool) + 320
  [bt] (4) 5   libtvm.dylib                        0x0000000120648977 tvm::RunTypeCheck(tvm::IRModule const&, tvm::GlobalVar const&, tvm::relay::Function) + 1431
  [bt] (3) 4   libtvm.dylib                        0x0000000120dbfd85 tvm::relay::InferType(tvm::relay::Function const&, tvm::IRModule const&, tvm::GlobalVar const&) + 565
  [bt] (2) 3   libtvm.dylib                        0x0000000120dbedd1 tvm::relay::TypeInferencer::Infer(tvm::RelayExpr) + 145
  [bt] (1) 2   libtvm.dylib                        0x000000012063b125 tvm::ErrorReporter::RenderErrors(tvm::IRModule const&, bool) + 5477
  [bt] (0) 1   libtvm.dylib                        0x0000000120530909 dmlc::LogMessageFatal::~LogMessageFatal() + 57
  [bt] (8) 9   libtvm.dylib                        0x0000000120648977 tvm::RunTypeCheck(tvm::IRModule const&, tvm::GlobalVar const&, tvm::relay::Function) + 1431
  [bt] (7) 8   libtvm.dylib                        0x0000000120dbfd85 tvm::relay::InferType(tvm::relay::Function const&, tvm::IRModule const&, tvm::GlobalVar const&) + 565
  [bt] (6) 7   libtvm.dylib                        0x0000000120dbedb5 tvm::relay::TypeInferencer::Infer(tvm::RelayExpr) + 117
  [bt] (5) 6   libtvm.dylib                        0x0000000120c8e4af tvm::relay::TypeSolver::Solve() + 559
  [bt] (4) 5   libtvm.dylib                        0x0000000120c8ed75 tvm::TypedEnvFunc<bool (tvm::Array<tvm::Type, void> const&, int, tvm::Attrs const&, tvm::TypeReporter const&)>::operator()(tvm::Array<tvm::Type, void> const&, int, tvm::Attrs const&, tvm::TypeReporter const&) const + 325
  [bt] (3) 4   libtvm.dylib                        0x000000012099d53b std::__1::__function::__func<void tvm::runtime::TypedPackedFunc<bool (tvm::Array<tvm::Type, void> const&, int, tvm::Attrs const&, tvm::TypeReporter const&)>::AssignTypedLambda<bool (*)(tvm::Array<tvm::Type, void> const&, int, tvm::Attrs const&, tvm::TypeReporter const&)>(bool (*)(tvm::Array<tvm::Type, void> const&, int, tvm::Attrs const&, tvm::TypeReporter const&))::'lambda'(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*), std::__1::allocator<void tvm::runtime::TypedPackedFunc<bool (tvm::Array<tvm::Type, void> const&, int, tvm::Attrs const&, tvm::TypeReporter const&)>::AssignTypedLambda<bool (*)(tvm::Array<tvm::Type, void> const&, int, tvm::Attrs const&, tvm::TypeReporter const&)>(bool (*)(tvm::Array<tvm::Type, void> const&, int, tvm::Attrs const&, tvm::TypeReporter const&))::'lambda'(tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*)>, void (tvm::runtime::TVMArgs, tvm::runtime::TVMRetValue*)>::operator()(tvm::runtime::TVMArgs&&, tvm::runtime::TVMRetValue*&&) + 107
  [bt] (2) 3   libtvm.dylib                        0x000000012099d6e3 void tvm::runtime::detail::unpack_call_dispatcher<bool, 0, 4, bool (*)(tvm::Array<tvm::Type, void> const&, int, tvm::Attrs const&, tvm::TypeReporter const&)>::run<tvm::runtime::TVMMovableArgValue_, tvm::runtime::TVMMovableArgValue_, tvm::runtime::TVMMovableArgValue_, tvm::runtime::TVMMovableArgValue_>(bool (* const&)(tvm::Array<tvm::Type, void> const&, int, tvm::Attrs const&, tvm::TypeReporter const&), tvm::runtime::TVMArgs const&, tvm::runtime::TVMRetValue*, tvm::runtime::TVMMovableArgValue_&&, tvm::runtime::TVMMovableArgValue_&&, tvm::runtime::TVMMovableArgValue_&&, tvm::runtime::TVMMovableArgValue_&&) + 323
  [bt] (1) 2   libtvm.dylib                        0x0000000120bbd75a tvm::relay::LayoutTransformRel(tvm::Array<tvm::Type, void> const&, int, tvm::Attrs const&, tvm::TypeReporter const&) + 234
  [bt] (0) 1   libtvm.dylib                        0x0000000120530909 dmlc::LogMessageFatal::~LogMessageFatal() + 57
  [bt] (8) 9   libtvm.dylib                        0x00000001207ea106 tvm::tir::BijectiveLayout::ForwardShape(tvm::Array<tvm::PrimExpr, void> const&) const + 182
  [bt] (7) 8   libtvm.dylib                        0x00000001207ead01 tvm::tir::TransformShape(tvm::Array<tvm::PrimExpr, void> const&, tvm::Array<tvm::tir::IterVar, void> const&, tvm::Array<tvm::tir::IterVar, void> const&, tvm::Array<tvm::PrimExpr, void> const&) + 2993
  [bt] (6) 7   libtvm.dylib                        0x0000000120867e67 tvm::tir::Substitute(tvm::PrimExpr, std::__1::unordered_map<tvm::tir::VarNode const*, tvm::PrimExpr, std::__1::hash<tvm::tir::VarNode const*>, std::__1::equal_to<tvm::tir::VarNode const*>, std::__1::allocator<std::__1::pair<tvm::tir::VarNode const* const, tvm::PrimExpr> > > const&) + 71
  [bt] (5) 6   libtvm.dylib                        0x0000000120553835 tvm::tir::ExprFunctor<tvm::PrimExpr (tvm::PrimExpr const&)>::VisitExpr(tvm::PrimExpr const&) + 53
  [bt] (4) 5   libtvm.dylib                        0x0000000120553c51 tvm::NodeFunctor<tvm::PrimExpr (tvm::runtime::ObjectRef const&, tvm::tir::ExprFunctor<tvm::PrimExpr (tvm::PrimExpr const&)>*)>::operator()(tvm::runtime::ObjectRef const&, tvm::tir::ExprFunctor<tvm::PrimExpr (tvm::PrimExpr const&)>*) const + 305
  [bt] (3) 4   libtvm.dylib                        0x0000000120557168 tvm::tir::ExprFunctor<tvm::PrimExpr (tvm::PrimExpr const&)>::InitVTable()::'lambda10'(tvm::runtime::ObjectRef const&, tvm::tir::ExprFunctor<tvm::PrimExpr (tvm::PrimExpr const&)>*)::__invoke(tvm::runtime::ObjectRef const&, tvm::tir::ExprFunctor<tvm::PrimExpr (tvm::PrimExpr const&)>*) + 24
  [bt] (2) 3   libtvm.dylib                        0x000000012080f745 tvm::tir::ExprMutator::VisitExpr_(tvm::tir::FloorDivNode const*) + 133
  [bt] (1) 2   libtvm.dylib                        0x00000001207fd86a tvm::tir::BinaryOpNode<tvm::tir::FloorDivNode>::make(tvm::PrimExpr, tvm::PrimExpr) + 394
  [bt] (0) 1   libtvm.dylib                        0x0000000120530909 dmlc::LogMessageFatal::~LogMessageFatal() + 57
  File "/Users/alopez/Documents/Code/tvm/src/ir/error.cc", line 133
TVMError: 
Error(s) have occurred. The program has been annotated with them:

In `main`: 
v0.0.4
fn (%input_1: Tensor[(1, 224, 224, 3), float32]) -> Tensor[(1, 1000), float32] {
  %0 = nn.pad(%input_1, pad_width=[[0, 0], [0, 1], [0, 1], [0, 0]]);
  %1 = transpose(%0, axes=[0, 3, 1, 2]);
  %2 = layout_transform(%1, src_layout="NCHW", dst_layout="NCHW3c");
  %3 = layout_transform(meta[relay.Constant][0], src_layout="OIHW", dst_layout="OIHW3i8o");
  %4 = nn.contrib_conv2d_NCHWc(%2, %3, strides=[2, 2], padding=[0, 0, 0, 0], channels=32, kernel_size=[3, 3], data_layout="NCHW3c", kernel_layout="OIHW3i8o", out_layout="NCHW8c");
  %5 = layout_transform(%4, src_layout="NCHW8c", dst_layout="NCHW");
  %6 = transpose(%5, axes=[0, 2, 3, 1]);
  %7 = cast(%6, dtype="float32");
  %8 = multiply(%7, meta[relay.Constant][1]);
  %9 = add(%8, meta[relay.Constant][2]);
  %10 = cast(%9, dtype="float32");
  %11 = clip(%10, a_min=0f, a_max=6f);
  %12 = transpose(%11, axes=[0, 3, 1, 2]);
  %13 = layout_transform(%12, src_layout="NCHW", dst_layout="NCHW8c");
  %14 = layout_transform(meta[relay.Constant][3], src_layout="OIHW", dst_layout="OIHW1i8o");
  %15 = nn.contrib_depthwise_conv2d_NCHWc(%13, %14, padding=[1, 1, 1, 1], groups=32, channels=32, kernel_size=[3, 3], data_layout="NCHW8c", kernel_layout="OIHW1i8o", out_layout="NCHW8c");
  %16 = layout_transform(%15, src_layout="NCHW8c", dst_layout="NCHW");
  %17 = transpose(%16, axes=[0, 2, 3, 1]);
  %18 = cast(%17, dtype="float32");
  %19 = multiply(%18, meta[relay.Constant][4]);
  %20 = add(%19, meta[relay.Constant][5]);
  %21 = cast(%20, dtype="float32");
  %22 = clip(%21, a_min=0f, a_max=6f);
  %23 = transpose(%22, axes=[0, 3, 1, 2]);
  %24 = layout_transform(%23, src_layout="NCHW", dst_layout="NCHW8c");
  %25 = layout_transform(meta[relay.Constant][6], src_layout="OIHW", dst_layout="OIHW8i8o");
  %26 = nn.contrib_conv2d_NCHWc(%24, %25, padding=[0, 0, 0, 0], channels=64, kernel_size=[1, 1], data_layout="NCHW8c", kernel_layout="OIHW8i8o", out_layout="NCHW8c");
  %27 = layout_transform(%26, src_layout="NCHW8c", dst_layout="NCHW");
  %28 = transpose(%27, axes=[0, 2, 3, 1]);
  %29 = cast(%28, dtype="float32");
  %30 = multiply(%29, meta[relay.Constant][7]);
  %31 = add(%30, meta[relay.Constant][8]);
  %32 = cast(%31, dtype="float32");
  %33 = clip(%32, a_min=0f, a_max=6f);
  %34 = nn.pad(%33, pad_width=[[0, 0], [0, 1], [0, 1], [0, 0]]);
  %35 = transpose(%34, axes=[0, 3, 1, 2]);
  %36 = layout_transform(%35, src_layout="NCHW", dst_layout="NCHW8c");
  %37 = layout_transform(meta[relay.Constant][9], src_layout="OIHW", dst_layout="OIHW1i8o");
  %38 = nn.contrib_depthwise_conv2d_NCHWc(%36, %37, strides=[2, 2], padding=[0, 0, 0, 0], groups=64, channels=64, kernel_size=[3, 3], data_layout="NCHW8c", kernel_layout="OIHW1i8o", out_layout="NCHW8c");
  %39 = layout_transform(%38, src_layout="NCHW8c", dst_layout="NCHW");
  %40 = transpose(%39, axes=[0, 2, 3, 1]);
  %41 = cast(%40, dtype="float32");
  %42 = multiply(%41, meta[relay.Constant][10]);
  %43 = add(%42, meta[relay.Constant][11]);
  %44 = cast(%43, dtype="float32");
  %45 = clip(%44, a_min=0f, a_max=6f);
  %46 = transpose(%45, axes=[0, 3, 1, 2]);
  %47 = layout_transform(%46, src_layout="NCHW", dst_layout="NCHW8c");
  %48 = layout_transform(meta[relay.Constant][12], src_layout="OIHW", dst_layout="OIHW8i8o");
  %49 = nn.contrib_conv2d_NCHWc(%47, %48, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NCHW8c", kernel_layout="OIHW8i8o", out_layout="NCHW8c");
  %50 = layout_transform(%49, src_layout="NCHW8c", dst_layout="NCHW");
  %51 = transpose(%50, axes=[0, 2, 3, 1]);
  %52 = cast(%51, dtype="float32");
  %53 = multiply(%52, meta[relay.Constant][13]);
  %54 = add(%53, meta[relay.Constant][14]);
  %55 = cast(%54, dtype="float32");
  %56 = clip(%55, a_min=0f, a_max=6f);
  %57 = transpose(%56, axes=[0, 3, 1, 2]);
  %58 = layout_transform(%57, src_layout="NCHW", dst_layout="NCHW8c");
  %59 = layout_transform(meta[relay.Constant][15], src_layout="OIHW", dst_layout="OIHW1i8o");
  %60 = nn.contrib_depthwise_conv2d_NCHWc(%58, %59, padding=[1, 1, 1, 1], groups=128, channels=128, kernel_size=[3, 3], data_layout="NCHW8c", kernel_layout="OIHW1i8o", out_layout="NCHW8c");
  %61 = layout_transform(%60, src_layout="NCHW8c", dst_layout="NCHW");
  %62 = transpose(%61, axes=[0, 2, 3, 1]);
  %63 = cast(%62, dtype="float32");
  %64 = multiply(%63, meta[relay.Constant][16]);
  %65 = add(%64, meta[relay.Constant][17]);
  %66 = cast(%65, dtype="float32");
  %67 = clip(%66, a_min=0f, a_max=6f);
  %68 = transpose(%67, axes=[0, 3, 1, 2]);
  %69 = layout_transform(%68, src_layout="NCHW", dst_layout="NCHW8c");
  %70 = layout_transform(meta[relay.Constant][18], src_layout="OIHW", dst_layout="OIHW8i8o");
  %71 = nn.contrib_conv2d_NCHWc(%69, %70, padding=[0, 0, 0, 0], channels=128, kernel_size=[1, 1], data_layout="NCHW8c", kernel_layout="OIHW8i8o", out_layout="NCHW8c");
  %72 = layout_transform(%71, src_layout="NCHW8c", dst_layout="NCHW");
  %73 = transpose(%72, axes=[0, 2, 3, 1]);
  %74 = cast(%73, dtype="float32");
  %75 = multiply(%74, meta[relay.Constant][19]);
  %76 = add(%75, meta[relay.Constant][20]);
  %77 = cast(%76, dtype="float32");
  %78 = clip(%77, a_min=0f, a_max=6f);
  %79 = nn.pad(%78, pad_width=[[0, 0], [0, 1], [0, 1], [0, 0]]);
  %80 = transpose(%79, axes=[0, 3, 1, 2]);
  %81 = layout_transform(%80, src_layout="NCHW", dst_layout="NCHW8c");
  %82 = layout_transform(meta[relay.Constant][21], src_layout="OIHW", dst_layout="OIHW1i8o");
  %83 = nn.contrib_depthwise_conv2d_NCHWc(%81, %82, strides=[2, 2], padding=[0, 0, 0, 0], groups=128, channels=128, kernel_size=[3, 3], data_layout="NCHW8c", kernel_layout="OIHW1i8o", out_layout="NCHW8c");
  %84 = layout_transform(%83, src_layout="NCHW8c", dst_layout="NCHW");
  %85 = transpose(%84, axes=[0, 2, 3, 1]);
  %86 = cast(%85, dtype="float32");
  %87 = multiply(%86, meta[relay.Constant][22]);
  %88 = add(%87, meta[relay.Constant][23]);
  %89 = cast(%88, dtype="float32");
  %90 = clip(%89, a_min=0f, a_max=6f);
  %91 = transpose(%90, axes=[0, 3, 1, 2]);
  %92 = layout_transform(%91, src_layout="NCHW", dst_layout="NCHW8c");
  %93 = layout_transform(meta[relay.Constant][24], src_layout="OIHW", dst_layout="OIHW8i8o");
  %94 = nn.contrib_conv2d_NCHWc(%92, %93, padding=[0, 0, 0, 0], channels=256, kernel_size=[1, 1], data_layout="NCHW8c", kernel_layout="OIHW8i8o", out_layout="NCHW8c");
  %95 = layout_transform(%94, src_layout="NCHW8c", dst_layout="NCHW");
  %96 = transpose(%95, axes=[0, 2, 3, 1]);
  %97 = cast(%96, dtype="float32");
  %98 = multiply(%97, meta[relay.Constant][25]);
  %99 = add(%98, meta[relay.Constant][26]);
  %100 = cast(%99, dtype="float32");
  %101 = clip(%100, a_min=0f, a_max=6f);
  %102 = transpose(%101, axes=[0, 3, 1, 2]);
  %103 = layout_transform(%102, src_layout="NCHW", dst_layout="NCHW8c");
  %104 = layout_transform(meta[relay.Constant][27], src_layout="OIHW", dst_layout="OIHW1i8o");
  %105 = nn.contrib_depthwise_conv2d_NCHWc(%103, %104, padding=[1, 1, 1, 1], groups=256, channels=256, kernel_size=[3, 3], data_layout="NCHW8c", kernel_layout="OIHW1i8o", out_layout="NCHW8c");
  %106 = layout_transform(%105, src_layout="NCHW8c", dst_layout="NCHW");
  %107 = transpose(%106, axes=[0, 2, 3, 1]);
  %108 = cast(%107, dtype="float32");
  %109 = multiply(%108, meta[relay.Constant][28]);
  %110 = add(%109, meta[relay.Constant][29]);
  %111 = cast(%110, dtype="float32");
  %112 = clip(%111, a_min=0f, a_max=6f);
  %113 = transpose(%112, axes=[0, 3, 1, 2]);
  %114 = layout_transform(%113, src_layout="NCHW", dst_layout="NCHW8c");
  %115 = layout_transform(meta[relay.Constant][30], src_layout="OIHW", dst_layout="OIHW8i8o");
  %116 = nn.contrib_conv2d_NCHWc(%114, %115, padding=[0, 0, 0, 0], channels=256, kernel_size=[1, 1], data_layout="NCHW8c", kernel_layout="OIHW8i8o", out_layout="NCHW8c");
  %117 = layout_transform(%116, src_layout="NCHW8c", dst_layout="NCHW");
  %118 = transpose(%117, axes=[0, 2, 3, 1]);
  %119 = cast(%118, dtype="float32");
  %120 = multiply(%119, meta[relay.Constant][31]);
  %121 = add(%120, meta[relay.Constant][32]);
  %122 = cast(%121, dtype="float32");
  %123 = clip(%122, a_min=0f, a_max=6f);
  %124 = nn.pad(%123, pad_width=[[0, 0], [0, 1], [0, 1], [0, 0]]);
  %125 = transpose(%124, axes=[0, 3, 1, 2]);
  %126 = layout_transform(%125, src_layout="NCHW", dst_layout="NCHW8c");
  %127 = layout_transform(meta[relay.Constant][33], src_layout="OIHW", dst_layout="OIHW1i8o");
  %128 = nn.contrib_depthwise_conv2d_NCHWc(%126, %127, strides=[2, 2], padding=[0, 0, 0, 0], groups=256, channels=256, kernel_size=[3, 3], data_layout="NCHW8c", kernel_layout="OIHW1i8o", out_layout="NCHW8c");
  %129 = layout_transform(%128, src_layout="NCHW8c", dst_layout="NCHW");
  %130 = transpose(%129, axes=[0, 2, 3, 1]);
  %131 = cast(%130, dtype="float32");
  %132 = multiply(%131, meta[relay.Constant][34]);
  %133 = add(%132, meta[relay.Constant][35]);
  %134 = cast(%133, dtype="float32");
  %135 = clip(%134, a_min=0f, a_max=6f);
  %136 = transpose(%135, axes=[0, 3, 1, 2]);
  %137 = layout_transform(%136, src_layout="NCHW", dst_layout="NCHW8c");
  %138 = layout_transform(meta[relay.Constant][36], src_layout="OIHW", dst_layout="OIHW8i8o");
  %139 = nn.contrib_conv2d_NCHWc(%137, %138, padding=[0, 0, 0, 0], channels=512, kernel_size=[1, 1], data_layout="NCHW8c", kernel_layout="OIHW8i8o", out_layout="NCHW8c");
  %140 = layout_transform(%139, src_layout="NCHW8c", dst_layout="NCHW");
  %141 = transpose(%140, axes=[0, 2, 3, 1]);
  %142 = cast(%141, dtype="float32");
  %143 = multiply(%142, meta[relay.Constant][37]);
  %144 = add(%143, meta[relay.Constant][38]);
  %145 = cast(%144, dtype="float32");
  %146 = clip(%145, a_min=0f, a_max=6f);
  %147 = transpose(%146, axes=[0, 3, 1, 2]);
  %148 = layout_transform(%147, src_layout="NCHW", dst_layout="NCHW8c");
  %149 = layout_transform(meta[relay.Constant][39], src_layout="OIHW", dst_layout="OIHW1i8o");
  %150 = nn.contrib_depthwise_conv2d_NCHWc(%148, %149, padding=[1, 1, 1, 1], groups=512, channels=512, kernel_size=[3, 3], data_layout="NCHW8c", kernel_layout="OIHW1i8o", out_layout="NCHW8c");
  %151 = layout_transform(%150, src_layout="NCHW8c", dst_layout="NCHW");
  %152 = transpose(%151, axes=[0, 2, 3, 1]);
  %153 = cast(%152, dtype="float32");
  %154 = multiply(%153, meta[relay.Constant][40]);
  %155 = add(%154, meta[relay.Constant][41]);
  %156 = cast(%155, dtype="float32");
  %157 = clip(%156, a_min=0f, a_max=6f);
  %158 = transpose(%157, axes=[0, 3, 1, 2]);
  %159 = layout_transform(%158, src_layout="NCHW", dst_layout="NCHW8c");
  %160 = layout_transform(meta[relay.Constant][42], src_layout="OIHW", dst_layout="OIHW8i8o");
  %161 = nn.contrib_conv2d_NCHWc(%159, %160, padding=[0, 0, 0, 0], channels=512, kernel_size=[1, 1], data_layout="NCHW8c", kernel_layout="OIHW8i8o", out_layout="NCHW8c");
  %162 = layout_transform(%161, src_layout="NCHW8c", dst_layout="NCHW");
  %163 = transpose(%162, axes=[0, 2, 3, 1]);
  %164 = cast(%163, dtype="float32");
  %165 = multiply(%164, meta[relay.Constant][43]);
  %166 = add(%165, meta[relay.Constant][44]);
  %167 = cast(%166, dtype="float32");
  %168 = clip(%167, a_min=0f, a_max=6f);
  %169 = transpose(%168, axes=[0, 3, 1, 2]);
  %170 = layout_transform(%169, src_layout="NCHW", dst_layout="NCHW8c");
  %171 = layout_transform(meta[relay.Constant][45], src_layout="OIHW", dst_layout="OIHW1i8o");
  %172 = nn.contrib_depthwise_conv2d_NCHWc(%170, %171, padding=[1, 1, 1, 1], groups=512, channels=512, kernel_size=[3, 3], data_layout="NCHW8c", kernel_layout="OIHW1i8o", out_layout="NCHW8c");
  %173 = layout_transform(%172, src_layout="NCHW8c", dst_layout="NCHW");
  %174 = transpose(%173, axes=[0, 2, 3, 1]);
  %175 = cast(%174, dtype="float32");
  %176 = multiply(%175, meta[relay.Constant][46]);
  %177 = add(%176, meta[relay.Constant][47]);
  %178 = cast(%177, dtype="float32");
  %179 = clip(%178, a_min=0f, a_max=6f);
  %180 = transpose(%179, axes=[0, 3, 1, 2]);
  %181 = layout_transform(%180, src_layout="NCHW", dst_layout="NCHW8c");
  %182 = layout_transform(meta[relay.Constant][48], src_layout="OIHW", dst_layout="OIHW8i8o");
  %183 = nn.contrib_conv2d_NCHWc(%181, %182, padding=[0, 0, 0, 0], channels=512, kernel_size=[1, 1], data_layout="NCHW8c", kernel_layout="OIHW8i8o", out_layout="NCHW8c");
  %184 = layout_transform(%183, src_layout="NCHW8c", dst_layout="NCHW");
  %185 = transpose(%184, axes=[0, 2, 3, 1]);
  %186 = cast(%185, dtype="float32");
  %187 = multiply(%186, meta[relay.Constant][49]);
  %188 = add(%187, meta[relay.Constant][50]);
  %189 = cast(%188, dtype="float32");
  %190 = clip(%189, a_min=0f, a_max=6f);
  %191 = transpose(%190, axes=[0, 3, 1, 2]);
  %192 = layout_transform(%191, src_layout="NCHW", dst_layout="NCHW8c");
  %193 = layout_transform(meta[relay.Constant][51], src_layout="OIHW", dst_layout="OIHW1i8o");
  %194 = nn.contrib_depthwise_conv2d_NCHWc(%192, %193, padding=[1, 1, 1, 1], groups=512, channels=512, kernel_size=[3, 3], data_layout="NCHW8c", kernel_layout="OIHW1i8o", out_layout="NCHW8c");
  %195 = layout_transform(%194, src_layout="NCHW8c", dst_layout="NCHW");
  %196 = transpose(%195, axes=[0, 2, 3, 1]);
  %197 = cast(%196, dtype="float32");
  %198 = multiply(%197, meta[relay.Constant][52]);
  %199 = add(%198, meta[relay.Constant][53]);
  %200 = cast(%199, dtype="float32");
  %201 = clip(%200, a_min=0f, a_max=6f);
  %202 = transpose(%201, axes=[0, 3, 1, 2]);
  %203 = layout_transform(%202, src_layout="NCHW", dst_layout="NCHW8c");
  %204 = layout_transform(meta[relay.Constant][54], src_layout="OIHW", dst_layout="OIHW8i8o");
  %205 = nn.contrib_conv2d_NCHWc(%203, %204, padding=[0, 0, 0, 0], channels=512, kernel_size=[1, 1], data_layout="NCHW8c", kernel_layout="OIHW8i8o", out_layout="NCHW8c");
  %206 = layout_transform(%205, src_layout="NCHW8c", dst_layout="NCHW");
  %207 = transpose(%206, axes=[0, 2, 3, 1]);
  %208 = cast(%207, dtype="float32");
  %209 = multiply(%208, meta[relay.Constant][55]);
  %210 = add(%209, meta[relay.Constant][56]);
  %211 = cast(%210, dtype="float32");
  %212 = clip(%211, a_min=0f, a_max=6f);
  %213 = transpose(%212, axes=[0, 3, 1, 2]);
  %214 = layout_transform(%213, src_layout="NCHW", dst_layout="NCHW8c");
  %215 = layout_transform(meta[relay.Constant][57], src_layout="OIHW", dst_layout="OIHW1i8o");
  %216 = nn.contrib_depthwise_conv2d_NCHWc(%214, %215, padding=[1, 1, 1, 1], groups=512, channels=512, kernel_size=[3, 3], data_layout="NCHW8c", kernel_layout="OIHW1i8o", out_layout="NCHW8c");
  %217 = layout_transform(%216, src_layout="NCHW8c", dst_layout="NCHW");
  %218 = transpose(%217, axes=[0, 2, 3, 1]);
  %219 = cast(%218, dtype="float32");
  %220 = multiply(%219, meta[relay.Constant][58]);
  %221 = add(%220, meta[relay.Constant][59]);
  %222 = cast(%221, dtype="float32");
  %223 = clip(%222, a_min=0f, a_max=6f);
  %224 = transpose(%223, axes=[0, 3, 1, 2]);
  %225 = layout_transform(%224, src_layout="NCHW", dst_layout="NCHW8c");
  %226 = layout_transform(meta[relay.Constant][60], src_layout="OIHW", dst_layout="OIHW8i8o");
  %227 = nn.contrib_conv2d_NCHWc(%225, %226, padding=[0, 0, 0, 0], channels=512, kernel_size=[1, 1], data_layout="NCHW8c", kernel_layout="OIHW8i8o", out_layout="NCHW8c");
  %228 = layout_transform(%227, src_layout="NCHW8c", dst_layout="NCHW");
  %229 = transpose(%228, axes=[0, 2, 3, 1]);
  %230 = cast(%229, dtype="float32");
  %231 = multiply(%230, meta[relay.Constant][61]);
  %232 = add(%231, meta[relay.Constant][62]);
  %233 = cast(%232, dtype="float32");
  %234 = clip(%233, a_min=0f, a_max=6f);
  %235 = transpose(%234, axes=[0, 3, 1, 2]);
  %236 = layout_transform(%235, src_layout="NCHW", dst_layout="NCHW8c");
  %237 = layout_transform(meta[relay.Constant][63], src_layout="OIHW", dst_layout="OIHW1i8o");
  %238 = nn.contrib_depthwise_conv2d_NCHWc(%236, %237, padding=[1, 1, 1, 1], groups=512, channels=512, kernel_size=[3, 3], data_layout="NCHW8c", kernel_layout="OIHW1i8o", out_layout="NCHW8c");
  %239 = layout_transform(%238, src_layout="NCHW8c", dst_layout="NCHW");
  %240 = transpose(%239, axes=[0, 2, 3, 1]);
  %241 = cast(%240, dtype="float32");
  %242 = multiply(%241, meta[relay.Constant][64]);
  %243 = add(%242, meta[relay.Constant][65]);
  %244 = cast(%243, dtype="float32");
  %245 = clip(%244, a_min=0f, a_max=6f);
  %246 = transpose(%245, axes=[0, 3, 1, 2]);
  %247 = layout_transform(%246, src_layout="NCHW", dst_layout="NCHW8c");
  %248 = layout_transform(meta[relay.Constant][66], src_layout="OIHW", dst_layout="OIHW8i8o");
  %249 = nn.contrib_conv2d_NCHWc(%247, %248, padding=[0, 0, 0, 0], channels=512, kernel_size=[1, 1], data_layout="NCHW8c", kernel_layout="OIHW8i8o", out_layout="NCHW8c");
  %250 = layout_transform(%249, src_layout="NCHW8c", dst_layout="NCHW");
  %251 = transpose(%250, axes=[0, 2, 3, 1]);
  %252 = cast(%251, dtype="float32");
  %253 = multiply(%252, meta[relay.Constant][67]);
  %254 = add(%253, meta[relay.Constant][68]);
  %255 = cast(%254, dtype="float32");
  %256 = clip(%255, a_min=0f, a_max=6f);
  %257 = nn.pad(%256, pad_width=[[0, 0], [0, 1], [0, 1], [0, 0]]);
  %258 = transpose(%257, axes=[0, 3, 1, 2]);
  %259 = layout_transform(%258, src_layout="NCHW", dst_layout="NCHW8c");
  %260 = layout_transform(meta[relay.Constant][69], src_layout="OIHW", dst_layout="OIHW1i8o");
  %261 = nn.contrib_depthwise_conv2d_NCHWc(%259, %260, strides=[2, 2], padding=[0, 0, 0, 0], groups=512, channels=512, kernel_size=[3, 3], data_layout="NCHW8c", kernel_layout="OIHW1i8o", out_layout="NCHW8c");
  %262 = layout_transform(%261, src_layout="NCHW8c", dst_layout="NCHW");
  %263 = transpose(%262, axes=[0, 2, 3, 1]);
  %264 = cast(%263, dtype="float32");
  %265 = multiply(%264, meta[relay.Constant][70]);
  %266 = add(%265, meta[relay.Constant][71]);
  %267 = cast(%266, dtype="float32");
  %268 = clip(%267, a_min=0f, a_max=6f);
  %269 = transpose(%268, axes=[0, 3, 1, 2]);
  %270 = layout_transform(%269, src_layout="NCHW", dst_layout="NCHW8c");
  %271 = layout_transform(meta[relay.Constant][72], src_layout="OIHW", dst_layout="OIHW8i8o");
  %272 = nn.contrib_conv2d_NCHWc(%270, %271, padding=[0, 0, 0, 0], channels=1024, kernel_size=[1, 1], data_layout="NCHW8c", kernel_layout="OIHW8i8o", out_layout="NCHW8c");
  %273 = layout_transform(%272, src_layout="NCHW8c", dst_layout="NCHW");
  %274 = transpose(%273, axes=[0, 2, 3, 1]);
  %275 = cast(%274, dtype="float32");
  %276 = multiply(%275, meta[relay.Constant][73]);
  %277 = add(%276, meta[relay.Constant][74]);
  %278 = cast(%277, dtype="float32");
  %279 = clip(%278, a_min=0f, a_max=6f);
  %280 = transpose(%279, axes=[0, 3, 1, 2]);
  %281 = layout_transform(%280, src_layout="NCHW", dst_layout="NCHW8c");
  %282 = layout_transform(meta[relay.Constant][75], src_layout="OIHW", dst_layout="OIHW1i8o");
  %283 = nn.contrib_depthwise_conv2d_NCHWc(%281, %282, padding=[1, 1, 1, 1], groups=1024, channels=1024, kernel_size=[3, 3], data_layout="NCHW8c", kernel_layout="OIHW1i8o", out_layout="NCHW8c");
  %284 = layout_transform(%283, src_layout="NCHW8c", dst_layout="NCHW");
  %285 = transpose(%284, axes=[0, 2, 3, 1]);
  %286 = cast(%285, dtype="float32");
  %287 = multiply(%286, meta[relay.Constant][76]);
  %288 = add(%287, meta[relay.Constant][77]);
  %289 = cast(%288, dtype="float32");
  %290 = clip(%289, a_min=0f, a_max=6f);
  %291 = transpose(%290, axes=[0, 3, 1, 2]);
  %292 = layout_transform(%291, src_layout="NCHW", dst_layout="NCHW8c");
  %293 = layout_transform(meta[relay.Constant][78], src_layout="OIHW", dst_layout="OIHW8i8o");
  %294 = nn.contrib_conv2d_NCHWc(%292, %293, padding=[0, 0, 0, 0], channels=1024, kernel_size=[1, 1], data_layout="NCHW8c", kernel_layout="OIHW8i8o", out_layout="NCHW8c");
  %295 = layout_transform(%294, src_layout="NCHW8c", dst_layout="NCHW");
  %296 = transpose(%295, axes=[0, 2, 3, 1]);
  %297 = cast(%296, dtype="float32");
  %298 = multiply(%297, meta[relay.Constant][79]);
  %299 = add(%298, meta[relay.Constant][80]);
  %300 = cast(%299, dtype="float32");
  %301 = clip(%300, a_min=0f, a_max=6f);
  %302 = mean(%301, axis=[1, 2]);
  %303 = reshape(%302, newshape=[1, 1, 1, 1024]);
  %304 = transpose(%303, axes=[0, 3, 1, 2]);
  %305 = layout_transform(%304, src_layout="NCHW", dst_layout="NCHW8c") an internal invariant was violated while typechecking your program [16:03:10] /Users/alopez/Documents/Code/tvm/include/tvm/tir/expr.h:143: Check failed: a.dtype() == b.dtype(): TypeError: mismatched types

; ;
  %306 = layout_transform(meta[relay.Constant][81], src_layout="OIHW", dst_layout="OIHW8i8o");
  %307 = nn.contrib_conv2d_NCHWc(%305, %306, padding=[0, 0, 0, 0], channels=1000, kernel_size=[1, 1], data_layout="NCHW8c", kernel_layout="OIHW8i8o", out_layout="NCHW8c");
  %308 = layout_transform(%307, src_layout="NCHW8c", dst_layout="NCHW") an internal invariant was violated while typechecking your program [16:03:10] /Users/alopez/Documents/Code/tvm/src/relay/op/tensor/transform.cc:2328: Check failed: data != nullptr: 
; ;
  %309 = transpose(%308, axes=[0, 2, 3, 1]);
  %310 = add(%309, meta[relay.Constant][82]);
  %311 = reshape(%310, newshape=[1, 1000]);
  nn.softmax(%311)
}
// meta data omitted. you can use show_meta_data=True to include meta data

Looking at the original source code, I think the error occurs around here:

    x = layers.Reshape(shape, name='reshape_1')(x)
    x = layers.Dropout(dropout, name='dropout')(x)
    x = layers.Conv2D(classes, (1, 1),
                      padding='same',
                      name='conv_preds')(x)
    x = layers.Reshape((classes,), name='reshape_2')(x)

As you can see, the error happens almost at the end of the model, and with opt_level=2 it does not occur. Note that the Dropout layer was already removed when my PB file was frozen.
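To see why only this last conversion trips up, it may help to trace the shapes through those final layers. This plain-numpy sketch mirrors Relay ops %302–%304 from the dump above (it is an illustration of the shape flow, not TVM code):

```python
import numpy as np

# Output of the last conv block, NHWC as in the frozen TensorFlow graph.
x = np.zeros((1, 7, 7, 1024), dtype="float32")

x = x.mean(axis=(1, 2))       # %302: global average pool -> (1, 1024)
x = x.reshape(1, 1, 1, 1024)  # %303: Keras Reshape -> NHWC (1, 1, 1, 1024)
x = x.transpose(0, 3, 1, 2)   # %304: to NCHW -> (1, 1024, 1, 1)
print(x.shape)  # (1, 1024, 1, 1)
```

This 1×1-spatial, 1024-channel tensor is what AlterOpLayout then tries to convert to NCHW8c in %305, which is the `layout_transform` that fails; at opt_level=2 that conversion is never attempted.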