Memory leak running in CPU mode on Windows [TVM 0.6.0]

Hi Experts,

As the title says, we found what seems to be a memory leak in tvm_runtime.dll.

Below is the stack information generated by heob (https://sourceforge.net/projects/heob/):

128 B * 29211 = 3739008 B (#34357) '28'
                           [malloc]
    0x00007FFFCB840000   tvm_runtime.dll
      0x00007FFFCB8BA76C   [TVMObjectTypeKey2Index]
      0x00007FFFCB8669AF   [TVMBackendParallelLaunch]
    0x00007FFFD5850000   VCOMP140.DLL
      0x00007FFFD58517A0   [_vcomp_fork]
      0x00007FFFD5851764   [_vcomp_fork]
      0x00007FFFD58582B5   [_vcomp_atomic_div_r8]
  128 B * 1365 = 174720 B (#34394) '3'
                           [malloc]
    0x00007FFFCB840000   tvm_runtime.dll
      0x00007FFFCB8BA76C   [TVMObjectTypeKey2Index]
      0x00007FFFCB8669AF   [TVMBackendParallelLaunch]
    0x00007FFFD5850000   VCOMP140.DLL
      0x00007FFFD58517A0   [_vcomp_fork]
      0x00007FFFD5851764   [_vcomp_fork]
      0x00007FFFD5858966   [_vcomp_atomic_div_r8]
      0x00007FFFD5851691   [_vcomp_fork]
    0x00007FFFCB840000   tvm_runtime.dll
      0x00007FFFCB866956   [TVMBackendParallelLaunch]
    **0x00007FFFC7780000   mnet12_1_640_480_llvm_lib.so**
      0x00007FFFC7792D4F
      0x00007FFFC7792A97
    0x00007FFFCB840000   tvm_runtime.dll
      0x00007FFFCB860C10   [public: virtual void tvm::runtime::ModuleNode::SaveToFile(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &,class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &)]
      0x00007FFFCB8615D7   [public: virtual void tvm::runtime::ModuleNode::SaveToFile(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &,class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &)]
      0x00007FFFCB89886D   [public: class tvm::runtime::DeviceAPI & tvm::runtime::DeviceAPI::operator=(class tvm::runtime::DeviceAPI const &)]
      0x00007FFFCB89D5EE   [public: class tvm::runtime::DeviceAPI & tvm::runtime::DeviceAPI::operator=(class tvm::runtime::DeviceAPI const &)]
    0x00007FFFDD5D0000   tvmfacer.dll
      0x00007FFFDD5E3DB6   functional:279 [std::_Func_class<void,tvm::runtime::TVMArgs,tvm::runtime::TVMRetValue * __ptr64>::operator()]
      0x00007FFFDD5F1630   packed_func.h:1219 [tvm::runtime::PackedFunc::operator()<>]
      **0x00007FFFDD5E1394   engine.h:220** [Engine<std::vector<cv::cuda::GpuMat,std::allocator<cv::cuda::GpuMat> >,std::vector<std::vector<FaceRect,std::allocator<FaceRect> >,std::allocator<std::vector<FaceRect,std::allocator<FaceRect> > > > >::DoInference]
      0x00007FFFDD5E018F   tvmfacer.cpp:272 [TVM_DetectFaces]
    0x00007FF6611A0000   Inferas.exe
      0x00007FF6611A5C88   inferas.cpp:455 [test_tvm_cam_facedetector]
      0x00007FF6611A9C4D   inferas.cpp:1058 [<lambda_3e530796f69c6d0c5eddbddcd658e69f>::operator()]
      0x00007FF6611B421A   type_traits:1375 [std::_Invoker_functor::_Call<<lambda_3e530796f69c6d0c5eddbddcd658e69f> >]
      0x00007FF6611B3F1A   type_traits:1443 [std::invoke<<lambda_3e530796f69c6d0c5eddbddcd658e69f> >]
      0x00007FF6611B3376   xthread:240 [std::_LaunchPad<std::unique_ptr<std::tuple<<lambda_3e530796f69c6d0c5eddbddcd658e69f> >,std::default_delete<std::tuple<<lambda_3e530796f69c6d0c5eddbddcd658e69f> > > > >::_Execute<0>]
      0x00007FF6611B2794   xthread:247 [std::_LaunchPad<std::unique_ptr<std::tuple<<lambda_3e530796f69c6d0c5eddbddcd658e69f> >,std::default_delete<std::tuple<<lambda_3e530796f69c6d0c5eddbddcd658e69f> > > > >::_Run]
      0x00007FF6611B2522   xthread:232 [std::_LaunchPad<std::unique_ptr<std::tuple<<lambda_3e530796f69c6d0c5eddbddcd658e69f> >,std::default_delete<std::tuple<<lambda_3e530796f69c6d0c5eddbddcd658e69f> > > > >::_Go]
      0x00007FF6611A3347   xthread:209 [std::_Pad::_Call_func]
    0x00007FFFF0210000   ucrtbase.dll
      0x00007FFFF0230E71   [_beginthreadex]

mnet12_1_640_480_llvm_lib.so is the library generated when compiling the model.

The frame at engine.h:220 is where we call the TVM “run” function to perform the inference.

Any suggestion would be great, thanks in advance.

It seems that USE_OPENMP leads to this issue, as described in the thread “Thread hangs there when unload the module”.

After turning this option OFF, there is no memory leak anymore.

Hi, zhigaowu.

I have found an object that is never deleted in thread_pool.cc of the runtime.

That code path is used when USE_OPENMP=ON. Could you try adding delete[] sync_counter at the end of the block?
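
For anyone following along, here is a minimal sketch of the pattern being discussed. It is not the actual code in thread_pool.cc: the names AllocCounters, FreeCounters, kStride, g_live_arrays, LaunchWithDelete and LaunchWithUniquePtr are all hypothetical, the parallel region is replaced by a single stand-in increment, and the live-array counter exists only so the effect can be checked. The idea is that an array allocated with new[] on every launch has to be released with delete[] before the block ends, or each call leaves one array behind.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <memory>

// Assumed padding between counters (illustrative value, not taken from TVM).
constexpr int kStride = 16;

// Number of counter arrays currently allocated; used only to observe leaks.
static std::atomic<int> g_live_arrays{0};

// Allocate one padded counter per task and zero the slots that are used.
std::atomic<int32_t>* AllocCounters(int num_task) {
  auto* counters = new std::atomic<int32_t>[num_task * kStride];
  for (int i = 0; i < num_task; ++i) {
    counters[i * kStride].store(0, std::memory_order_relaxed);
  }
  g_live_arrays.fetch_add(1);
  return counters;
}

// Release an array obtained from AllocCounters.
void FreeCounters(std::atomic<int32_t>* counters) {
  delete[] counters;
  g_live_arrays.fetch_sub(1);
}

// Variant 1: explicit release at the end of the block.
// Without the FreeCounters call, every launch would leak one array.
int32_t LaunchWithDelete(int num_task) {
  std::atomic<int32_t>* sync_counter = AllocCounters(num_task);
  sync_counter[0].fetch_add(1);  // stand-in for the real worker body
  int32_t seen = sync_counter[0].load();
  FreeCounters(sync_counter);
  return seen;
}

// Variant 2: RAII, so the array is released on every exit path.
int32_t LaunchWithUniquePtr(int num_task) {
  std::unique_ptr<std::atomic<int32_t>[], void (*)(std::atomic<int32_t>*)>
      sync_counter(AllocCounters(num_task), FreeCounters);
  sync_counter[0].fetch_add(1);  // stand-in for the real worker body
  return sync_counter[0].load();
}
```

Calling either variant in a loop should leave g_live_arrays at zero afterwards. The RAII variant may be the safer choice if the worker body can return early or throw, but whether it fits the real runtime code is for someone who knows that file to judge.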

Sorry for the late response; I was away for the Spring Festival.

I can’t modify this code logic myself, because I have no experience with MP and I don’t understand how this snippet is used in the wider context.

@zhigaowu Thanks for reporting the issue. I pushed a fix for the memory leak. Could you check again and see if it fixes the issue?