I’m inspecting graph_runtime.cc and trying to understand the internal inference data flow. I’ll deploy on a server-class CPU.
My understanding is that the graph_runtime.create function binds each Node to its corresponding operator when loading graph.json and deploy.so.
When the function below is invoked, the graph is evaluated sequentially:
void Run() {
  // setup the array and requirements.
  for (size_t i = 0; i < op_execs_.size(); ++i) {
    if (op_execs_[i]) op_execs_[i]();
  }
}
How can I make graph_runtime execute in parallel at the graph level, i.e. run independent nodes concurrently instead of looping over op_execs_ one by one?