Deploying the tuned TFLite graph with C++

yvn · February 25, 2019, 6:25am

After tuning the TFLite graph and building it, we can get the .so using export_library. To deploy it in C++, is loading this .so enough? I think the params extracted from the graph also need to be passed in for execution, but how do we do that?

yvn · February 26, 2019, 1:06am

@eqy do you have any suggestions here?

eqy · February 26, 2019, 1:08am

I don’t have much experience using the C++ API. Have you tried it and checked the results? If it looks anything like the Python API, the parameters are not passed again.

yvn · February 26, 2019, 1:16am

The how_to_deploy example in apps only shows a simple vector addition, I guess. In that case we know there are only 3 tensors involved, so it is easy. But when it comes to a full graph, how should I approach this step?
For example, how many tensors are needed, which function should I call to run inference, and what should the input params be? Can you give me some information about that so that I can try it?

eqy · February 26, 2019, 1:29am

Did you compile using NNVM or Relay? The NNVM example is here: https://github.com/dmlc/nnvm/blob/master/docs/how_to/deploy.md (it looks like you may need to load parameters after all).

yvn · February 26, 2019, 1:37am

I compiled the TFLite model with Relay!

eqy · February 26, 2019, 1:46am

Looking more closely, it looks like the actual running part is using the TVM runtime, so in theory it should be interchangeable if you used Relay.
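
As a rough sketch of what that looks like on the C++ side: besides the .so, the graph JSON and the serialized params are typically read into memory as strings before being handed to the TVM runtime. The helper below covers only that file-reading step in plain standard C++. `read_file` and the file names in the usage note are hypothetical, and the TVM runtime calls themselves are deliberately left out.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of TVM): read a whole file into a std::string.
// Binary mode keeps the serialized parameter blob byte-for-byte intact and is
// harmless for the JSON text as well.
std::string read_file(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    // Copy every byte from the stream until end-of-file.
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Usage would be something like `std::string graph_json = read_file("graph.json");` and `std::string params_blob = read_file("graph.params");`. The linked deploy example then passes such strings to the runtime, so check it for the exact calls.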

yvn · February 26, 2019, 2:11am

Okay, thanks. I will try that and let you know.

yvn · February 26, 2019, 5:46am

Can you mention someone who has more experience using the C++ API?

masahi · February 26, 2019, 6:31am

You can take a look at this post. It doesn’t matter whether you use NNVM or Relay.

yvn · February 26, 2019, 10:53am

With NNVM, there is a way to write the params out in the form of a string:

f_params.write(nnvm.compiler.save_param_dict(params))

Is there a way to do the same with Relay, i.e. write the param dictionary to a file in the form of a string, so that all 3 files needed for deployment (.so, .json and .params) can be obtained?

masahi · February 27, 2019, 6:57am

You are in luck. It was just merged: https://github.com/dmlc/tvm/pull/2620

yvn · February 27, 2019, 7:13am

Thanks for the update!
It seems that when I saved the param_dict using NNVM, it also got built and deployed without any error. What is the difference between the two?

yvn · February 27, 2019, 10:54am

@masahi @eqy, there are a few things that I want to know.
I want to deploy the mobilenet_v1_224 TFLite model on an arm64 CPU target in C++. As discussed in the replies above, I would need:

graph.json
graph.so
graph.param
I am able to generate the .json and the .so, as they don’t depend on Relay. But generating graph.param depends on Relay support, which seems to have been merged in git. However, I faced some new errors while compiling the Python file after pulling the latest code from git, described here (Unable to compile the tflite model with relay after pulling the latest code from remote). Could you share some code, or a link to a patch that I could adapt to my requirements, where a TFLite model is deployed as an application built in C++ on an arm64 CPU target?
Thanks!
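
Before loading anything in C++, it can help to check that all three files actually made it onto the target. A minimal sketch in plain standard C++, assuming the file names listed above. `missing_files` is a hypothetical helper, not part of TVM.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of TVM): return the subset of `paths`
// that cannot be opened for reading.
std::vector<std::string> missing_files(const std::vector<std::string>& paths) {
    std::vector<std::string> missing;
    for (const std::string& p : paths) {
        std::ifstream in(p, std::ios::binary);
        if (!in) {
            missing.push_back(p);  // file is absent or unreadable
        }
    }
    return missing;
}
```

Calling `missing_files({"graph.json", "graph.so", "graph.param"})` returns an empty vector when all three files can be opened. Otherwise it lists the ones to regenerate or copy over.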