[c++ deploy] How to manage resources for multiple tvm instances in single application?

Hi there,

I have a C++ application that runs multiple TVM instances in parallel, and I would like each instance to use 4 CPU cores to optimize execution time. For example, on a machine with 20 cores, I would run 5 TVM instances, and each instance would use only its own 4 assigned cores.
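To make the layout I have in mind concrete, here is a minimal sketch of the core partitioning only. `CoresForInstance` is a hypothetical helper of my own, not a TVM API, and it assumes each instance owns a block of consecutive core ids. How to actually bind each instance's worker threads to these cores is exactly the part I am unsure about.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of TVM): returns the CPU ids reserved for
// instance `instance_id`, assuming every instance owns `cores_per_instance`
// consecutive cores on a machine with `total_cores` cores.
std::vector<int> CoresForInstance(int instance_id, int cores_per_instance,
                                  int total_cores) {
  assert(instance_id >= 0);
  assert(cores_per_instance > 0);
  // The requested block must fit on the machine.
  assert((instance_id + 1) * cores_per_instance <= total_cores);

  std::vector<int> cores;
  cores.reserve(cores_per_instance);
  const int first_core = instance_id * cores_per_instance;
  for (int i = 0; i < cores_per_instance; ++i) {
    cores.push_back(first_core + i);
  }
  return cores;
}
```

With 20 cores and 4 cores per instance, instance 0 would get cores 0 to 3 and instance 4 would get cores 16 to 19, so no two instances share a core.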

If the instances did not need to run in parallel, I could set an environment variable, e.g. export TVM_NUM_THREADS=4, to limit CPU usage for the entire application. However, I don’t know the best practice for applying such a limit to each instance individually when several of them may run in parallel inside the same application.

Any insights? Thanks in advance!
