Because the TVM GraphRuntime does not support control flow, we have to split our model into two parts. To save memory, we would like the two parts to share parameters. However, there are two issues we need to address first:
- GraphRuntime automatically allocates memory when the module is created (GraphRuntime::SetupStorage). How can we specify that a particular entry, such as a shared parameter, should be allocated later?
- “set_input” always copies the parameter (DLTensor) into data_entry. We need a set function that accepts an NDArray, so that the actual storage can be shared. This should be relatively easy to add as a new PackedFunc.
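
To make the second point concrete, here is a toy sketch of the copy versus share behaviour we have in mind. It is plain Python and does not use TVM at all: `ToyRuntime`, `set_input_copy`, and `share_input` are hypothetical names made up for illustration, and Python lists stand in for tensor storage.

```python
class ToyRuntime:
    """Toy stand-in for a graph runtime; NOT the TVM API."""

    def __init__(self, num_entries, size):
        # Every entry gets its own buffer up front, loosely mirroring
        # the eager allocation done when the module is created.
        self.data_entry = [[0.0] * size for _ in range(num_entries)]

    def set_input_copy(self, index, values):
        # Copy semantics: the values are written into the runtime's own buffer.
        self.data_entry[index][:] = values

    def share_input(self, index, buffer):
        # Hypothetical sharing setter: the entry is rebound to the caller's buffer.
        self.data_entry[index] = buffer


weights = [1.0, 2.0, 3.0]
part_a = ToyRuntime(1, 3)
part_b = ToyRuntime(1, 3)

# With the copying setter, the runtime keeps a separate buffer.
part_a.set_input_copy(0, weights)
copied = part_a.data_entry[0] is weights

# With the sharing setter, both parts point at the same storage.
part_a.share_input(0, weights)
part_b.share_input(0, weights)
shared = part_a.data_entry[0] is part_b.data_entry[0]

# An update to the shared buffer is visible from both parts.
weights[0] = 42.0
```

In this sketch the buffers allocated in the constructor are simply discarded once `share_input` rebinds the entry. That wasted allocation is the first issue above: ideally the storage for a shared entry would not be allocated at creation time at all.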
Could anyone give us some advice? Thanks!