[NNVM] Fail to work with MXNet


#1

I was trying to run the following script:
vta/tutorials/frontend/deploy_resnet_on_vta.py

However, after checking out the latest versions of MXNet and TVM, I repeatedly got the following error message:

terminate called after throwing an instance of 'dmlc::Error'
  what():  [19:35:43] /home/liangfu/workspace/tvm_upstream/nnvm/include/nnvm/op.h:473: Check failed: pmap->type() == typeid(OpMap<ValueType>): Attribute FInferShape of operator resize is registered as inconsistent types previously N4nnvm5OpMapISt8functionIFbRKNS_9NodeAt
trsEPSt6vectorIN5mxnet6TShapeESaIS7_EESA_EEEE current N4nnvm5OpMapISt8functionIFbRKNS_9NodeAttrsEPSt6vectorINS_6TShapeESaIS6_EES9_EEEE
Stack trace:
  [bt] (0) /home/liangfu/workspace/tvm_upstream/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x3b) [0x7f92d765d1a9]
  [bt] (1) /home/liangfu/workspace/tvm_upstream/nnvm/python/nnvm/../../../build/libnnvm_compiler.so(nnvm::Op::set_attr<std::function<bool (nnvm::NodeAttrs const&, std::vector<nnvm::TShape, std::allocator<nnvm::TShape> >*, std::vector<nnvm::TShape, std::allocator<nnvm::T
Shape> >*)> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<bool (nnvm::NodeAttrs const&, std::vector<nnvm::TShape, std::allocator<nnvm::TShape> >*, std::vector<nnvm::TShape, std::allocator<nnvm::TShape> >*)> const
&, int)::{lambda(dmlc::any*)#1}::operator()(dmlc::any*) const+0x1d0) [0x7f92d1bfd59e]
  [bt] (2) /home/liangfu/workspace/tvm_upstream/nnvm/python/nnvm/../../../build/libnnvm_compiler.so(std::_Function_handler<void (dmlc::any*), nnvm::Op::set_attr<std::function<bool (nnvm::NodeAttrs const&, std::vector<nnvm::TShape, std::allocator<nnvm::TShape> >*, std::v
ector<nnvm::TShape, std::allocator<nnvm::TShape> >*)> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<bool (nnvm::NodeAttrs const&, std::vector<nnvm::TShape, std::allocator<nnvm::TShape> >*, std::vector<nnvm::TShap
e, std::allocator<nnvm::TShape> >*)> const&, int)::{lambda(dmlc::any*)#1}>::_M_invoke(std::_Any_data const&, dmlc::any*&&)+0x3a) [0x7f92d1c039bb]
  [bt] (3) /home/liangfu/workspace/tvm_upstream/nnvm/python/nnvm/../../../build/libnnvm_compiler.so(std::function<void (dmlc::any*)>::operator()(dmlc::any*) const+0x49) [0x7f92d1b75795]
  [bt] (4) /home/liangfu/workspace/tvm_upstream/nnvm/python/nnvm/../../../build/libnnvm_compiler.so(nnvm::Op::UpdateAttrMap(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<void (dmlc::any*)>)+0xe0) [0x7f92d1b74706]
  [bt] (5) /home/liangfu/workspace/tvm_upstream/nnvm/python/nnvm/../../../build/libnnvm_compiler.so(nnvm::Op& nnvm::Op::set_attr<std::function<bool (nnvm::NodeAttrs const&, std::vector<nnvm::TShape, std::allocator<nnvm::TShape> >*, std::vector<nnvm::TShape, std::allocat
or<nnvm::TShape> >*)> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<bool (nnvm::NodeAttrs const&, std::vector<nnvm::TShape, std::allocator<nnvm::TShape> >*, std::vector<nnvm::TShape, std::allocator<nnvm::TShape> 
>*)> const&, int)+0x164) [0x7f92d1bfdaa4]
  [bt] (6) /home/liangfu/workspace/tvm_upstream/nnvm/python/nnvm/../../../build/libnnvm_compiler.so(+0x52b2e4) [0x7f92d1bf12e4]
  [bt] (7) /home/liangfu/workspace/tvm_upstream/nnvm/python/nnvm/../../../build/libnnvm_compiler.so(+0x52b7c3) [0x7f92d1bf17c3]
  [bt] (8) /lib64/ld-linux-x86-64.so.2(+0xf37a) [0x7f92fd0eb37a]

I have traced the error to the import of the vta module. More specifically, the error is raised when the vta module imports nnvm internally.

When I switched to importing vta before mxnet, I got a similar error message:

terminate called after throwing an instance of 'dmlc::Error'
  what():  [19:48:58] /home/liangfu/workspace/mxnet/3rdparty/tvm/nnvm/include/nnvm/op.h:473: Check failed: pmap->type() == typeid(OpMap<ValueType>): Attribute FInferShape of operator _npi_multinomial is registered as inconsistent types previously N4nnvm5OpMapISt8functio
nIFbRKNS_9NodeAttrsEPSt6vectorINS_6TShapeESaIS6_EES9_EEEE current N4nnvm5OpMapISt8functionIFbRKNS_9NodeAttrsEPSt6vectorIN5mxnet6TShapeESaIS7_EESA_EEEE
Stack trace:
  [bt] (0) /home/liangfu/workspace/tvm_upstream/build/libtvm.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x3b) [0x7f726f6f51a9]
  [bt] (1) /home/liangfu/.local/lib/python3.7/site-packages/mxnet-1.6.0-py3.7.egg/mxnet/libmxnet.so(nnvm::Op::set_attr<std::function<bool (nnvm::NodeAttrs const&, std::vector<mxnet::TShape, std::allocator<mxnet::TShape> >*, std::vector<mxnet::TShape, std::allocator<mxne
t::TShape> >*)> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<bool (nnvm::NodeAttrs const&, std::vector<mxnet::TShape, std::allocator<mxnet::TShape> >*, std::vector<mxnet::TShape, std::allocator<mxnet::TShape> >*
)> const&, int)::{lambda(dmlc::any*)#1}::operator()(dmlc::any*) const+0x15a) [0x7f7265b0b71a]
  [bt] (2) /home/liangfu/.local/lib/python3.7/site-packages/mxnet-1.6.0-py3.7.egg/mxnet/libmxnet.so(nnvm::Op::UpdateAttrMap(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<void (dmlc::any*)>)+0xb9) [0x7f7268ceb8e9]
  [bt] (3) /home/liangfu/.local/lib/python3.7/site-packages/mxnet-1.6.0-py3.7.egg/mxnet/libmxnet.so(nnvm::Op& nnvm::Op::set_attr<std::function<bool (nnvm::NodeAttrs const&, std::vector<mxnet::TShape, std::allocator<mxnet::TShape> >*, std::vector<mxnet::TShape, std::allo
cator<mxnet::TShape> >*)> >(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<bool (nnvm::NodeAttrs const&, std::vector<mxnet::TShape, std::allocator<mxnet::TShape> >*, std::vector<mxnet::TShape, std::allocator<mxnet::
TShape> >*)> const&, int)+0x12b) [0x7f7265b01eeb]
  [bt] (4) /home/liangfu/.local/lib/python3.7/site-packages/mxnet-1.6.0-py3.7.egg/mxnet/libmxnet.so(+0x5372ac) [0x7f72659da2ac]
  [bt] (5) /lib64/ld-linux-x86-64.so.2(+0xf37a) [0x7f727d9cf37a]
  [bt] (6) /lib64/ld-linux-x86-64.so.2(+0xf476) [0x7f727d9cf476]
  [bt] (7) /lib64/ld-linux-x86-64.so.2(+0x132d3) [0x7f727d9d32d3]
  [bt] (8) /lib/x86_64-linux-gnu/libc.so.6(_dl_catch_exception+0x6f) [0x7f727d4beb2f]

It seems to me that the error is caused by incompatible definitions of the FInferShape attribute between the current nnvm and the nnvm used in MXNet.
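
To make the mismatch easier to see, here is a small sketch that inspects the two mangled type names copied from the first error message above. It assumes the usual Itanium C++ name mangling, where `S_` is a back-reference to the first namespace in the name (here `nnvm`). The helper name `tshape_namespace` is made up for illustration and is not part of TVM or MXNet.

```python
# Mangled type names copied from the "previously" / "current" parts of the
# first error message (rejoined across the terminal line wrap).
PREVIOUS = ("N4nnvm5OpMapISt8functionIFbRKNS_9NodeAttrsEPSt6vector"
            "IN5mxnet6TShapeESaIS7_EESA_EEEE")
CURRENT = ("N4nnvm5OpMapISt8functionIFbRKNS_9NodeAttrsEPSt6vector"
           "INS_6TShapeESaIS6_EES9_EEEE")


def tshape_namespace(mangled):
    """Return which namespace's TShape appears in a mangled type name.

    Hypothetical helper: it only looks for the two substrings seen in
    this particular error message, it is not a general demangler.
    """
    if "N5mxnet6TShapeE" in mangled:
        return "mxnet"   # mxnet::TShape
    if "NS_6TShapeE" in mangled:
        return "nnvm"    # S_ refers back to nnvm, so nnvm::TShape
    return "unknown"


print("previously registered with:", tshape_namespace(PREVIOUS))
print("currently registered with: ", tshape_namespace(CURRENT))
```

In the first log the attribute was registered with mxnet::TShape and then again with nnvm::TShape; in the second log the order is reversed, which matches the swapped import order.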

Any ideas on how to resolve this would be appreciated.


#2

This issue can be temporarily fixed by reverting to MXNet v1.3.1, because later versions of MXNet use ::mxnet::TShape instead of nnvm::TShape when defining operator attributes.
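
A minimal sketch of a version guard that a script could run before importing vta, assuming v1.3.1 is the newest release known to work as described above. The exact release in which MXNet switched to ::mxnet::TShape is not stated in this thread, so the threshold is an assumption, and the helper names are hypothetical.

```python
# Newest MXNet release reported to work in this thread (assumption: anything
# newer may already use ::mxnet::TShape and trigger the crash).
KNOWN_GOOD = (1, 3, 1)


def parse_version(version):
    """Turn a plain 'major.minor.patch' string into a tuple of ints."""
    return tuple(int(part) for part in version.split(".")[:3])


def is_known_good(version):
    """Return True if the given MXNet version is at or below KNOWN_GOOD."""
    return parse_version(version) <= KNOWN_GOOD


# Example with the two versions mentioned in this thread.
for v in ("1.3.1", "1.6.0"):
    print(v, "->", "ok" if is_known_good(v) else "may conflict with nnvm")
```

In a real script the argument would be the installed version string, for example `mxnet.__version__`, checked before vta (and therefore nnvm) is imported.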