Build with opt_level=0 fails with error: Attribute TOpPattern has not been registered for Operator qnn.requantize

  File "tf_tvm_interpreter.py", line 171, in run_test_person_detection
    tvm_output = run_tvm_graph(tflite_model_buf, img_data, 'input')

  File "tf_tvm_interpreter.py", line 63, in run_tvm_graph
    graph, lib, params = relay.build(mod, target, params=params)

  File "/home/siju/workspace/tvm/python/tvm/relay/build_module.py", line 251, in build
    graph_json, mod, params = bld_mod.build(mod, target, target_host, params)

  File "/home/siju/workspace/tvm/python/tvm/relay/build_module.py", line 120, in build
    self._build(mod, target, target_host)

  File "/home/siju/workspace/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 219, in __call__
    raise get_last_ffi_error()

tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (8) /home/siju/workspace/tvm/build/libtvm.so(tvm::relay::IndexedForwardGraph::Create(tvm::support::Arena*, tvm::RelayExpr const&)+0xfe) [0x7eff4c25f17e]
  [bt] (7) /home/siju/workspace/tvm/build/libtvm.so(tvm::relay::ExprVisitor::VisitExpr(tvm::RelayExpr const&)+0x7b) [0x7eff4c38dd7b]
  [bt] (6) /home/siju/workspace/tvm/build/libtvm.so(tvm::relay::ExprFunctor<void (tvm::RelayExpr const&)>::VisitExpr(tvm::RelayExpr const&)+0x92) [0x7eff4c17b242]
  [bt] (5) /home/siju/workspace/tvm/build/libtvm.so(tvm::relay::IndexedForwardGraph::Creator::VisitExpr_(tvm::relay::FunctionNode const*)+0x2bf) [0x7eff4c265c6f]
  [bt] (4) /home/siju/workspace/tvm/build/libtvm.so(tvm::relay::ExprVisitor::VisitExpr_(tvm::relay::FunctionNode const*)+0xe3) [0x7eff4c38a303]
  [bt] (3) /home/siju/workspace/tvm/build/libtvm.so(tvm::relay::ExprVisitor::VisitExpr(tvm::RelayExpr const&)+0x7b) [0x7eff4c38dd7b]
  [bt] (2) /home/siju/workspace/tvm/build/libtvm.so(tvm::relay::ExprFunctor<void (tvm::RelayExpr const&)>::VisitExpr(tvm::RelayExpr const&)+0x92) [0x7eff4c17b242]
  [bt] (1) /home/siju/workspace/tvm/build/libtvm.so(tvm::relay::IndexedForwardGraph::Creator::VisitExpr_(tvm::relay::CallNode const*)+0x461) [0x7eff4c262fe1]
  [bt] (0) /home/siju/workspace/tvm/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x7c) [0x7eff4bae5abc]
  File "/home/siju/workspace/tvm/include/tvm/ir/op.h", line 574
TVMError: Check failed: idx < data_.size() && data_[idx].second != 0: Attribute TOpPattern has not been registered for Operator qnn.requantize

@anijain2305 Can you please help me with this issue? I want to keep opt_level=0 because the TFLite conv weights are int8 and the biases are int32, but after op fusion most of the TVM param weights become int16 and int32. Because of this, the param size in TVM is higher than in TFLite.

I want to load the model on an Arduino with limited RAM and flash memory. TFLite is able to run it, but with TVM I am getting memory issues.
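For context, this is roughly how the build sets opt_level (a minimal sketch only; build_at_opt_level is an illustrative helper name, and the PassContext API is assumed to match the TVM version shown in the traceback):

    import tvm
    from tvm import relay

    def build_at_opt_level(mod, target, params, opt_level=0):
        # opt_level=0 skips op fusion and most graph-level optimizations,
        # which is what would keep the per-layer int8/int32 params intact.
        with tvm.transform.PassContext(opt_level=opt_level):
            return relay.build(mod, target, params=params)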

  • We will have to run with at least opt_level=1, because the Legalize pass, which is necessary for QNN, runs at level 1.

  • The right way to solve this problem is to disable the upcasting. Currently, the weights are upcast to int16 for ARM CPUs because that gives better performance on Raspberry Pi. We can selectively disable it.

For now, you can set is_fast_int8_on_arm to True and start from there (see the sketch below).
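A minimal sketch of that workaround, not an authoritative recipe: it assumes relay.build is invoked as in the traceback above, and the stated location of the is_fast_int8_on_arm helper is my assumption for this TVM version.

    import tvm
    from tvm import relay

    def build_qnn(mod, target, params):
        # opt_level=1 is the minimum at which the QNN Legalize pass runs;
        # below that, qnn.* ops survive into fusion and trigger the
        # "TOpPattern has not been registered" error shown above.
        with tvm.transform.PassContext(opt_level=1):
            return relay.build(mod, target, params=params)

    # To keep the conv weights int8 on ARM, the suggestion above is to force
    # is_fast_int8_on_arm to return True so the legalization does not upcast
    # the weights to int16 (assumed location in this TVM version:
    # python/tvm/relay/qnn/op/legalizations.py).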


@anjiang2016 Thanks a lot for your quick reply. I was able to resolve it. Now my model size is the same as TFLite's.

@anijain2305 I met the same problem, but my purpose is different. The reason for setting opt_level=0 is that I want to print the output data of every intermediate layer, so I don't want ops to be fused. Do you know how to solve this, or another way to print intermediate layer outputs? Thanks a lot.

Hi, you can use the QNN Canonicalize pass - [QNN] How to decompose qnn.requantize into relay-only operators
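A minimal sketch of that approach, assuming relay.qnn.transform.CanonicalizeOps is available in your TVM version: it rewrites qnn.requantize (and the other qnn.* ops) into plain Relay operators, after which the module can be built at opt_level=0.

    import tvm
    from tvm import relay

    def canonicalize_and_build(mod, target, params):
        # Decompose qnn.* ops (including qnn.requantize) into Relay-only operators.
        mod = relay.qnn.transform.CanonicalizeOps()(mod)
        # With no qnn.* ops left, building at opt_level=0 no longer hits the
        # "TOpPattern has not been registered" check, and op fusion stays
        # disabled so intermediate layer outputs remain separate.
        with tvm.transform.PassContext(opt_level=0):
            return relay.build(mod, target, params=params)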