Deploying the tuned TFLite graph with C++


#1

After tuning the TFLite graph and building it, we can get the .so using export_library. To deploy it in C++, is taking this .so as input enough? I think some params extracted from the graph also need to be passed in for execution, but how do we do that?


#2

@eqy do you have any input here?


#3

I don’t have much experience using the C++ API. Have you tried it and checked the results? If it works anything like the Python API, the parameters are not passed again.


#4

The how_to_deploy example in the apps only shows, I think, a simple vector addition. In that case it is easy to set up the tensors, because we know there are only 3 of them. But how should I handle this step for a full graph?
For example: how many tensors need to be set up, which function should I call to run inference, and what should the input params be? Can you give me some information about that so that I can try it?


#5

Did you compile using NNVM or Relay? The NNVM example is here: https://github.com/dmlc/nnvm/blob/master/docs/how_to/deploy.md (it looks like you may need to load parameters after all).
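
For the parameter-loading step, a minimal C++ sketch of reading the serialized params (and the graph JSON) from disk might look like the following. It assumes the params were saved as a binary blob, so the file is opened in binary mode. `ReadFileBytes` is a hypothetical helper name, not part of TVM. As far as I can tell from the linked guide, buffers like these are what get handed to the graph runtime, but I have not verified that against the current code.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a TVM API): read a whole file into a std::string
// without text-mode translation, so embedded '\0' and "\r\n" bytes survive.
std::string ReadFileBytes(const std::string& path) {
  assert(!path.empty() && "path must not be empty");
  std::ifstream in(path, std::ios::in | std::ios::binary);
  if (!in) {
    throw std::runtime_error("cannot open " + path);
  }
  // Copy every byte from the stream buffer until end of file.
  return std::string(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
}
```

The same helper works for the graph JSON and for the params file, since reading text in binary mode is harmless here. The file names you pass in are whatever you chose at export time.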


#6

I compiled the TFLite model with Relay!


#7

Looking more closely, it looks like the actual running part is using the TVM runtime, so in theory it should be interchangeable if you used Relay.


#8

Okay, thanks. I will try that and let you know.


#9

Can you mention someone who has more experience with the C++ API?


#10

You can take a look at this post. It doesn’t matter whether you use NNVM or Relay.


#11

With NNVM, there is a way to write the params out as a serialized byte string:

f_params.write(nnvm.compiler.save_param_dict(params))

Is there a way to write the params to a file in the same serialized form with Relay, so that all 3 files needed for deployment (.so, .json and .params) can be obtained?
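
On the C++ side, one thing that may help while sorting this out is failing early when one of the three files is absent. A minimal sketch, assuming the three artifacts share a common path stem and use the `.so`, `.json` and `.params` extensions mentioned above; `MissingArtifacts` is a hypothetical helper, not a TVM function:

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not a TVM API): return the artifact paths that cannot
// be opened for reading, given a shared stem such as "build/graph".
std::vector<std::string> MissingArtifacts(const std::string& stem) {
  assert(!stem.empty() && "stem must not be empty");
  std::vector<std::string> missing;
  for (const char* ext : {".so", ".json", ".params"}) {
    const std::string path = stem + ext;
    std::ifstream probe(path, std::ios::in | std::ios::binary);
    if (!probe) {
      missing.push_back(path);  // Absent or unreadable.
    }
  }
  return missing;
}
```

An empty result means all three files are at least readable. It says nothing about whether their contents match each other.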


#12

You are in luck. It was just merged: https://github.com/dmlc/tvm/pull/2620


#13

Thanks for the update!
It seems that when I saved the param_dict using NNVM, the model still built and deployed without any error. What is the difference between the two?


#14

@masahi @eqy, there are a few things I would like to know.
I want to deploy the mobilenet_v1_224 TFLite model on an arm64 CPU target in C++. As discussed in the replies above, I would need:

  1. graph.json
  2. graph.so
  3. graph.param

I am able to generate the .json and the .so, as they don’t depend on Relay. Saving graph.param does depend on Relay support, which seems to have been merged upstream. However, after pulling the latest code from git I ran into new errors while running my Python build script, described here (Unable to compile the tflite model with relay after pulling the latest code from remote). Could you share some code, or a link to a patch that I can adapt to my requirements, where any TFLite model is deployed as a C++ application on an arm64 CPU target?

Thanks!
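
For sizing the input tensor in such an application, a small sketch of computing element and byte counts from a shape is shown below. It assumes a single float32 image of 224x224 pixels with 3 channels, i.e. 1x224x224x3, or 1x3x224x224 if the compiled graph expects NCHW (the count is the same either way). `NumElements` and `ByteSize` are hypothetical helpers, not TVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: product of all dimensions (1 for an empty shape).
std::size_t NumElements(const std::vector<std::int64_t>& shape) {
  std::size_t n = 1;
  for (std::int64_t d : shape) {
    assert(d > 0 && "dimensions must be positive");
    n *= static_cast<std::size_t>(d);
  }
  return n;
}

// Hypothetical helper: bytes needed for a dense tensor of the given shape.
std::size_t ByteSize(const std::vector<std::int64_t>& shape,
                     std::size_t elem_bytes) {
  return NumElements(shape) * elem_bytes;
}
```

With the assumed shape, `NumElements({1, 224, 224, 3})` is 150528, and `ByteSize({1, 224, 224, 3}, 4)` is 602112 for 4-byte floats. The actual input shape and dtype should be checked against the model you compiled.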