Incrrect use of NDArray leads to memory leak

sleepwalker2017 · April 22, 2020, 2:39pm

I see the following segment, it gets pointer from an NDArray in an incorrect way.

  auto pa = (float*)a_val.ToDLPack()->dl_tensor.data;
  auto pb = (float*)b_val.ToDLPack()->dl_tensor.data;
  auto pc = (float*)c_val.ToDLPack()->dl_tensor.data;

This code segment comes from here:

https://github.com/apache/incubator-tvm/blob/master/tests/cpp/build_module_test.cc line 158.

here is the definition of ToDLPack

static DLManagedTensor* ToDLPack(NDArray::Container* from) {
    CHECK(from != nullptr);
    DLManagedTensor* ret = new DLManagedTensor();
    ret->dl_tensor = from->dl_tensor;
    ret->manager_ctx = from;
    from->IncRef();
    ret->deleter = NDArrayDLPackDeleter;
    return ret;
  }

ToDLPack() created an object and called IncRef() which increased the ref_counter, and then it returns the pointer , but no outer DecRef() is called, the returned pointer is just lost. So the container which called ToDLPack is never destructed.

Seems each invocation of auto p = ndarray.ToDLPack() should be followed with a p->deleter().

And I also noticed that , many internal codes in tvm uses NDArray in this way. potential mem leak ?

The right way to get data in an NDArray is a_val->data.

inline const DLTensor* NDArray::operator->() const {
  return &(data_->dl_tensor);
}

tqchen · April 22, 2020, 3:31pm

According to the DLPack convention, it is the duty of the holder DLManagedTensor to call the deleter of that, so it is expected behavior(in ToDLPack).

Great catch about the incorrect usage of a_val.ToDLPack()->dl_tensor.data, they should be changed in a_val->data like you suggested as we are not trying to get a retainable container out, please feel free to send a PR

tqchen · April 22, 2020, 3:25pm

cc @jroesch @zhiics @haichen

tqchen · April 22, 2020, 6:03pm

haichen · April 22, 2020, 7:16pm

Prepared a fix for this