How to deploy NNVM models in C++

KenArrari · November 6, 2018, 12:01am

dmlc/tvm/blob/master/docs/deploy/nnvm.md

# Deploy NNVM Modules
NNVM compiled modules are fully embedded in the TVM runtime as long as the ```GRAPH_RUNTIME``` option
is enabled in the TVM runtime.


In a nutshell, we will need three items to deploy a compiled module.
Check out our tutorials on getting started with the NNVM compiler for more details.

- The graph json data which contains the execution graph.
- The tvm module library of compiled functions.
- The parameter blobs for stored parameters.

We can then use TVM's runtime API to deploy the compiled module.
Here is an example in python.

```python
import tvm

# tvm module for compiled functions.
loaded_lib = tvm.module.load("deploy.so")

```

(The file preview is truncated here.)

I tried to follow this example for C++, and I immediately get an error that dlpack cannot be found.
After manually fixing all of the include directories, I still get include errors for some of the headers that these .h files include.

Is there something else missing from my TVM installation? All of the Python examples work fine otherwise.

masahi · November 6, 2018, 1:33am

You need to include the headers from dmlc-core, dlpack, and tvm.

KenArrari · November 6, 2018, 4:26pm

They are being included, but the includes fail.
Even after fixing them manually, this:

#include "…/…/…/tvm/dlpack/include/dlpack/dlpack.h"
#include "…/…/…/tvm/include/tvm/runtime/module.h"
#include "…/…/…/tvm/include/tvm/runtime/registry.h"
#include "…/…/…/tvm/include/tvm/runtime/packed_func.h"

fails because these headers include other headers, which then cannot be found. I think I'm just having an issue with installing the headers correctly.

masahi · November 7, 2018, 12:22am

You just need to put the include dirs of tvm, dlpack, and dmlc-core in the same directory. You don't need to install anything.

The directory structure should look something like this:

dlpack/
dmlc/
tvm/
dlpack/dlpack.h
dmlc/any.h
dmlc/array_view.h
dmlc/base.h
dmlc/blockingconcurrentqueue.h
dmlc/common.h
dmlc/concurrency.h
dmlc/concurrentqueue.h
dmlc/config.h
dmlc/data.h
dmlc/endian.h
dmlc/input_split_shuffle.h
dmlc/io.h
dmlc/json.h
dmlc/logging.h
dmlc/lua.h
dmlc/memory.h
dmlc/memory_io.h
dmlc/omp.h
dmlc/optional.h
dmlc/parameter.h
dmlc/recordio.h
dmlc/registry.h
dmlc/serializer.h
dmlc/thread_group.h
dmlc/thread_local.h
dmlc/threadediter.h
dmlc/timer.h
dmlc/type_traits.h
tvm/api_registry.h
tvm/arithmetic.h
tvm/base.h
tvm/buffer.h
tvm/build_module.h
tvm/c_dsl_api.h
tvm/channel.h
tvm/codegen.h
tvm/expr.h
tvm/ir.h
tvm/ir_functor_ext.h
tvm/ir_mutator.h
tvm/ir_operator.h
tvm/ir_pass.h
tvm/ir_visitor.h
tvm/logging.h
tvm/lowered_func.h
tvm/operation.h
tvm/packed_func_ext.h
tvm/runtime/
tvm/schedule.h
tvm/schedule_pass.h
tvm/target_info.h
tvm/tensor.h
tvm/tensor_intrin.h
tvm/tvm.h
tvm/runtime/c_backend_api.h
tvm/runtime/c_runtime_api.h
tvm/runtime/device_api.h
tvm/runtime/module.h
tvm/runtime/ndarray.h
tvm/runtime/packed_func.h
tvm/runtime/registry.h
tvm/runtime/serializer.h
tvm/runtime/threading_backend.h
tvm/runtime/util.h

KenArrari · November 8, 2018, 3:36am

In the same directory as what I’m compiling?

I’m confused by that dir structure because all of those folders are already sub-repositories inside of tvm. (except with dmlc-core instead of dmlc)

masahi · November 8, 2018, 4:35am

The include directory can be anywhere; you just need to set up the include path appropriately (with CMake, for example).
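As a quick way to see whether the include path is set up, here is a minimal sketch that asks the compiler which of the three header roots it can see. It assumes a compiler that supports `__has_include` (standard since C++17). The three header names are taken from the directory listing above; the struct, macro, and function names are made up for illustration and are not part of TVM.

```cpp
// Sketch: report which of the three header roots the current -I flags expose.
// Assumes the compiler supports __has_include (standard since C++17).
#include <string>
#include <vector>

#if __has_include(<dlpack/dlpack.h>)
#define HAVE_DLPACK_HEADER true
#else
#define HAVE_DLPACK_HEADER false
#endif

#if __has_include(<dmlc/logging.h>)
#define HAVE_DMLC_HEADER true
#else
#define HAVE_DMLC_HEADER false
#endif

#if __has_include(<tvm/runtime/module.h>)
#define HAVE_TVM_HEADER true
#else
#define HAVE_TVM_HEADER false
#endif

// Hypothetical helper type, not part of TVM.
struct HeaderStatus {
  std::string header;  // path as it would appear in an #include directive
  bool found;          // true if the compiler could locate it at build time
};

// One entry per header root needed for deployment.
std::vector<HeaderStatus> check_deploy_headers() {
  return {
      {"dlpack/dlpack.h", HAVE_DLPACK_HEADER},
      {"dmlc/logging.h", HAVE_DMLC_HEADER},
      {"tvm/runtime/module.h", HAVE_TVM_HEADER},
  };
}

// Comma-separated list of the headers that were not found (empty if none).
std::string missing_headers(const std::vector<HeaderStatus>& statuses) {
  std::string out;
  for (const HeaderStatus& s : statuses) {
    if (s.found) continue;
    if (!out.empty()) out += ", ";
    out += s.header;
  }
  return out;
}
```

Printing `missing_headers(check_deploy_headers())` from a small `main` and compiling with the same `-I` flags as the real program shows which root is still missing. An empty string means all three are visible.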

KenArrari · November 8, 2018, 5:09pm

Thank you, I got it to work with this command:

g++ -std=c++11 -O2 -fPIC -I$TVM_PATH/tvm/include -I$TVM_PATH/tvm/dmlc-core/include -I$TVM_PATH/tvm/dlpack/include -o lib/cpp_deploy_normal cpp_deploy.cc lib/test_addone_sys.o -L$TVM_PATH/tvm/build -ldl -lpthread -ltvm_runtime

There's also a helpful script in tvm/apps/howto_deploy/run_example.sh.

However, it doesn't actually run because of undefined references:

/tvm/build/libtvm_runtime.so: undefined reference to 'pthread_create'
/tvm/build/libtvm_runtime.so: undefined reference to 'dlopen'
/tvm/build/libtvm_runtime.so: undefined reference to 'dlclose'
/tvm/build/libtvm_runtime.so: undefined reference to 'dlerror'
/tvm/build/libtvm_runtime.so: undefined reference to 'dlsym'
/tvm/build/libtvm_runtime.so: undefined reference to 'pthread_setaffinity_np'

wk738126046 · December 19, 2018, 8:22am

Hi,
I'm writing a C++ test application to run inference with MXNet + ResNet-18.
Reference: https://docs.tvm.ai/deploy/nnvm.html.
The build works, but the result looks wrong:
The maximum position in output vector is: 0
The input data is prepared as follows:
cat.png -> resize(224,224,3)(RGB) -> cat.bin

How can I fix it?
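For readers following along: a "maximum position" like the one printed above is usually an argmax over the output scores. The sketch below is not the poster's code and its function names are illustrative. It shows the argmax together with a check for a degenerate output, since an output vector whose entries are all identical (all zeros, for example) also yields position 0.

```cpp
// Sketch: argmax over an output vector, plus a degenerate-output check.
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <iterator>
#include <vector>

// Index of the largest score; ties resolve to the first occurrence.
std::size_t argmax(const std::vector<float>& scores) {
  assert(!scores.empty());
  return static_cast<std::size_t>(std::distance(
      scores.begin(), std::max_element(scores.begin(), scores.end())));
}

// True when no two neighbouring scores differ, i.e. all entries are equal.
// In that case argmax returns 0 regardless of the input image.
bool all_equal(const std::vector<float>& scores) {
  return std::adjacent_find(scores.begin(), scores.end(),
                            std::not_equal_to<float>()) == scores.end();
}
```

If `all_equal` returns true for the network output, the problem is more likely in the input data or the parameter loading than in the argmax itself.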

kaka7 · December 20, 2019, 11:51am

I had the same problem. The cause is that the input data (cat.bin) must be saved as float32: in Python, write it with `numpy_data.astype("float32").tofile("cat.bin")`, then load the data in C++.
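On the loading side, a size check catches this kind of dtype mismatch early. The sketch below assumes the file is a raw dump of float32 values with no header, which is what `tofile` writes. It refuses to load a file whose byte count does not match the expected number of elements. The function name is illustrative.

```cpp
// Sketch: load a raw float32 dump and reject files of the wrong size.
#include <cstddef>
#include <fstream>
#include <ios>
#include <stdexcept>
#include <string>
#include <vector>

// Read exactly `expected_count` float32 values from a raw binary file.
// Throws std::runtime_error if the file cannot be opened or if its size
// differs from expected_count * 4 bytes (for example when the array was
// written as uint8 or float64 instead of float32).
std::vector<float> load_float32_bin(const std::string& path,
                                    std::size_t expected_count) {
  static_assert(sizeof(float) == 4, "this sketch assumes a 32-bit float");

  // Open at the end so tellg() reports the file size.
  std::ifstream in(path, std::ios::binary | std::ios::ate);
  if (!in) throw std::runtime_error("cannot open " + path);

  const long long file_bytes = static_cast<long long>(in.tellg());
  const long long expected_bytes =
      static_cast<long long>(expected_count * sizeof(float));
  if (file_bytes != expected_bytes) {
    throw std::runtime_error(path + ": expected " +
                             std::to_string(expected_bytes) +
                             " bytes of float32 data, found " +
                             std::to_string(file_bytes));
  }

  std::vector<float> data(expected_count);
  in.seekg(0, std::ios::beg);
  in.read(reinterpret_cast<char*>(data.data()),
          static_cast<std::streamsize>(expected_bytes));
  if (!in) throw std::runtime_error("short read from " + path);
  return data;
}
```

For the 224 x 224 x 3 input mentioned earlier in the thread, the call would be `load_float32_bin("cat.bin", 224 * 224 * 3)`. A correct file is 602112 bytes. A uint8 dump would be 150528 bytes and a float64 dump 1204224 bytes, and both are rejected.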