Memory leak running in CPU mode on Windows [TVM 0.6.0]

Hi Experts,

As the title says, we found what seems to be a memory leak in tvm_runtime.dll.

Below is the stack information generated by heob [https://sourceforge.net/projects/heob/]:

128 B * 29211 = 3739008 B (#34357) '28'
                           [malloc]
    0x00007FFFCB840000   tvm_runtime.dll
      0x00007FFFCB8BA76C   [TVMObjectTypeKey2Index]
      0x00007FFFCB8669AF   [TVMBackendParallelLaunch]
    0x00007FFFD5850000   VCOMP140.DLL
      0x00007FFFD58517A0   [_vcomp_fork]
      0x00007FFFD5851764   [_vcomp_fork]
      0x00007FFFD58582B5   [_vcomp_atomic_div_r8]
  128 B * 1365 = 174720 B (#34394) '3'
                           [malloc]
    0x00007FFFCB840000   tvm_runtime.dll
      0x00007FFFCB8BA76C   [TVMObjectTypeKey2Index]
      0x00007FFFCB8669AF   [TVMBackendParallelLaunch]
    0x00007FFFD5850000   VCOMP140.DLL
      0x00007FFFD58517A0   [_vcomp_fork]
      0x00007FFFD5851764   [_vcomp_fork]
      0x00007FFFD5858966   [_vcomp_atomic_div_r8]
      0x00007FFFD5851691   [_vcomp_fork]
    0x00007FFFCB840000   tvm_runtime.dll
      0x00007FFFCB866956   [TVMBackendParallelLaunch]
    **0x00007FFFC7780000   mnet12_1_640_480_llvm_lib.so**
      0x00007FFFC7792D4F
      0x00007FFFC7792A97
    0x00007FFFCB840000   tvm_runtime.dll
      0x00007FFFCB860C10   [public: virtual void tvm::runtime::ModuleNode::SaveToFile(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &,class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &)]
      0x00007FFFCB8615D7   [public: virtual void tvm::runtime::ModuleNode::SaveToFile(class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &,class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > const &)]
      0x00007FFFCB89886D   [public: class tvm::runtime::DeviceAPI & tvm::runtime::DeviceAPI::operator=(class tvm::runtime::DeviceAPI const &)]
      0x00007FFFCB89D5EE   [public: class tvm::runtime::DeviceAPI & tvm::runtime::DeviceAPI::operator=(class tvm::runtime::DeviceAPI const &)]
    0x00007FFFDD5D0000   tvmfacer.dll
      0x00007FFFDD5E3DB6   functional:279 [std::_Func_class<void,tvm::runtime::TVMArgs,tvm::runtime::TVMRetValue * __ptr64>::operator()]
      0x00007FFFDD5F1630   packed_func.h:1219 [tvm::runtime::PackedFunc::operator()<>]
      **0x00007FFFDD5E1394   engine.h:220** [Engine<std::vector<cv::cuda::GpuMat,std::allocator<cv::cuda::GpuMat> >,std::vector<std::vector<FaceRect,std::allocator<FaceRect> >,std::allocator<std::vector<FaceRect,std::allocator<FaceRect> > > > >::DoInference]
      0x00007FFFDD5E018F   tvmfacer.cpp:272 [TVM_DetectFaces]
    0x00007FF6611A0000   Inferas.exe
      0x00007FF6611A5C88   inferas.cpp:455 [test_tvm_cam_facedetector]
      0x00007FF6611A9C4D   inferas.cpp:1058 [<lambda_3e530796f69c6d0c5eddbddcd658e69f>::operator()]
      0x00007FF6611B421A   type_traits:1375 [std::_Invoker_functor::_Call<<lambda_3e530796f69c6d0c5eddbddcd658e69f> >]
      0x00007FF6611B3F1A   type_traits:1443 [std::invoke<<lambda_3e530796f69c6d0c5eddbddcd658e69f> >]
      0x00007FF6611B3376   xthread:240 [std::_LaunchPad<std::unique_ptr<std::tuple<<lambda_3e530796f69c6d0c5eddbddcd658e69f> >,std::default_delete<std::tuple<<lambda_3e530796f69c6d0c5eddbddcd658e69f> > > > >::_Execute<0>]
      0x00007FF6611B2794   xthread:247 [std::_LaunchPad<std::unique_ptr<std::tuple<<lambda_3e530796f69c6d0c5eddbddcd658e69f> >,std::default_delete<std::tuple<<lambda_3e530796f69c6d0c5eddbddcd658e69f> > > > >::_Run]
      0x00007FF6611B2522   xthread:232 [std::_LaunchPad<std::unique_ptr<std::tuple<<lambda_3e530796f69c6d0c5eddbddcd658e69f> >,std::default_delete<std::tuple<<lambda_3e530796f69c6d0c5eddbddcd658e69f> > > > >::_Go]
      0x00007FF6611A3347   xthread:209 [std::_Pad::_Call_func]
    0x00007FFFF0210000   ucrtbase.dll
      0x00007FFFF0230E71   [_beginthreadex]

mnet12_1_640_480_llvm_lib.so is the library generated when compiling the model.

engine.h:220 is where we call the TVM “run” function to do the inference.

Any suggestion would be great, thanks in advance.

It seems that USE_OPENMP leads to this issue, as described in the thread "Thread hangs there when unload the module".

After turning this option OFF, there is no memory leak any more.

Hi, zhigaowu.

I have found an object that is never deleted in thread_pool.cc of the runtime.

That code path is taken when USE_OPENMP=ON. Could you try adding delete[] sync_counter at the end of the block?
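
To illustrate the suggestion, here is a minimal sketch of the pattern, not the actual thread_pool.cc code. It assumes that each parallel launch allocates an array of atomic sync counters with new[] and hands it to the task through an environment struct. GroupEnv, TaskFn, LaunchOnce, kStride and g_live_arrays are hypothetical names made up for this example, and the OpenMP parallel region is left out so the snippet builds without OpenMP.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration; these are not the real TVM declarations.
constexpr int kStride = 16;  // assumed spacing between counters

struct GroupEnv {
  void* sync_handle;  // points at the per-launch counter array
  int32_t num_task;
};

using TaskFn = int (*)(int task_id, GroupEnv* env, void* cdata);

// Counts counter arrays that are currently allocated, so a leak is observable.
static std::atomic<int> g_live_arrays{0};

// One launch: allocate the counters, run the task, then free the counters.
int LaunchOnce(TaskFn fn, void* cdata, int num_task) {
  assert(num_task > 0);
  GroupEnv env;
  env.num_task = num_task;

  auto* sync_counter = new std::atomic<int32_t>[num_task * kStride];
  g_live_arrays.fetch_add(1);
  for (int i = 0; i < num_task; ++i) {
    sync_counter[i * kStride].store(0, std::memory_order_relaxed);
  }
  env.sync_handle = sync_counter;

  int rc = fn(0, &env, cdata);

  // The suggested fix: without this line every launch leaks one array.
  delete[] sync_counter;
  g_live_arrays.fetch_sub(1);
  return rc;
}
```

If the real block has more than one exit path, holding the array in a std::unique_ptr<std::atomic<int32_t>[]> would release it automatically; the explicit delete[] above just mirrors the suggestion as written.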

Sorry for the late response; it was the Spring Festival.

I can’t modify this code logic, because I have no experience with MP and I don’t understand how this snippet is used in the whole context.

@zhigaowu Thanks for reporting the issue. I pushed a fix for the memory leak. Could you check again and see if it fixes the issue?

We do not use this feature currently. We will test this fix and update this post once we have updated the code and started using this feature again.

Thanks.