The ABI compatibility of `PackedFunc`

Hi there!
I ran into an ABI compatibility problem with the MXTVMBridge function of MXNet.

I then wrote a minimal reproducible example to test MXTVMBridge.
In this example, tvm_packed_func.h is a simplified version of the header from TVM.

Steps to reproduce

  1. clone the example
  2. change the path of libmxnet.so on line 55 of the code.
  3. make
  4. ./test

The example works when I build MXNet from source, but it doesn’t work when MXNet is installed with pip.

@tqchen said:

Because the bridge uses C++ std::function, it requires a common ABI between the two in order to make things work, so building from source is indeed the best way.

Other programs sometimes need to call a TVM PackedFunc, so this ABI compatibility problem is hard to avoid.
Would it be possible to provide an API in TVM like the following?
It might avoid the ABI compatibility problem. : )
In include/tvm/runtime/packed_func.h:

class PackedFunc {
 public:
  using FType = std::function<void(TVMArgs args, TVMRetValue* rv)>;
  PackedFunc() {
    SetRemoteCallPacked();
  };
  explicit PackedFunc(FType body) : body_(body) {
    SetRemoteCallPacked();
  }
  void CallPacked(TVMArgs args, TVMRetValue* rv) const {
    _RemoteCallPacked(this, args, rv);
  }
  template <typename... Args>
  inline TVMRetValue operator()(Args&&... args) const {
    const int kNumArgs = sizeof...(Args);
    const int kArraySize = kNumArgs > 0 ? kNumArgs : 1;
    TVMValue values[kArraySize];
    int type_codes[kArraySize];
    detail::for_each(TVMArgsSetter(values, type_codes),
                     std::forward<Args>(args)...);
    TVMRetValue rv;
    _RemoteCallPacked(this, TVMArgs(values, type_codes, kNumArgs), &rv);
    return rv;
  }
 private:
  void SetRemoteCallPacked() {
    _RemoteCallPacked = [](const PackedFunc* func, TVMArgs args, TVMRetValue* rv) {
      func->body_(args, rv);
    };
  }
  void (*_RemoteCallPacked)(const PackedFunc* func, TVMArgs args, TVMRetValue* rv);
 private:
  FType body_;
...
};

Thanks!

C++ ABI problems occur when the DLLs are built with different compilers or compiler versions (gcc, clang), because their internal representations of std::function differ.

The only way around it is to use the C API, which is stable across DLL boundaries. In particular, a PackedFunc that crosses the ABI boundary needs to be registered with the CPackedFunc signature. Nevertheless, it is a bit tricky to do so.
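To illustrate the idea of keeping std::function on one side of the boundary, here is a minimal sketch. It is not TVM's actual API: DemoCFunc, DemoFinalizer, DemoHandle, DemoBody, MakeHandle and CallSum are hypothetical names invented for this example, and the argument encoding (a plain int array) is a simplification. Only plain function pointers and a void* cross the boundary; the std::function object is created, invoked and destroyed by code compiled on the producer side.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical C-style signatures: only pointers and ints cross the boundary.
typedef int (*DemoCFunc)(const int* args, int num_args, int* ret, void* resource);
typedef void (*DemoFinalizer)(void* resource);

// Plain handle whose layout does not depend on any std::function internals.
struct DemoHandle {
  DemoCFunc call;
  void* resource;
  DemoFinalizer finalize;
};

// Producer-side closure type; it never leaves the producer's code.
using DemoBody = std::function<int(const int*, int)>;

// Producer side: hide a std::function behind the C-style signature.
DemoHandle MakeHandle(DemoBody body) {
  DemoHandle h;
  h.resource = new DemoBody(std::move(body));
  h.call = [](const int* args, int num_args, int* ret, void* resource) -> int {
    *ret = (*static_cast<DemoBody*>(resource))(args, num_args);
    return 0;  // 0 means success in this sketch
  };
  h.finalize = [](void* resource) { delete static_cast<DemoBody*>(resource); };
  return h;
}

// Consumer side: calls through the function pointer only.
int CallSum(const DemoHandle& h, int a, int b) {
  int args[2] = {a, b};
  int ret = 0;
  int status = h.call(args, 2, &ret, h.resource);
  return status == 0 ? ret : -1;
}
```

The consumer is expected to call h.finalize(h.resource) once it is done, so the closure is also destroyed by producer-side code.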

Thanks!
Could my code get around the C++ ABI problem?
The body_ function is executed by code in the program that created the PackedFunc, not by the caller, so FType body_ itself poses no C++ ABI problem.

Although the internal name of _RemoteCallPacked may differ between compiler versions, its offset within the class PackedFunc is the same.

The general idea is OK, but it again boils down to the C API. Your example code cannot work because you cannot assign a lambda to a function pointer.

In fact, his code should work. The lambda assigned to _RemoteCallPacked doesn’t capture anything, so according to the C++ standard it can be converted to a function pointer. C++ standard §5.1.2 says:

The closure type for a lambda-expression with no lambda-capture has a public non-virtual non-explicit const conversion function to pointer to function having the same parameter and return types as the closure type’s function call operator. The value returned by this conversion function shall be the address of a function that, when invoked, has the same effect as invoking the closure type’s function call operator.
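A small self-contained check of that rule, as a sketch: Counter and BumpTwice are made-up names for illustration, and the bump member plays the role that _RemoteCallPacked plays in the proposal above.

```cpp
#include <cassert>

struct Counter {
  int value = 0;
  // Plain function pointer member, analogous to _RemoteCallPacked.
  void (*bump)(Counter* self, int delta);

  Counter() {
    // The lambda has an empty capture list, so its closure type converts
    // implicitly to a pointer to function with the same signature.
    // Writing [this](...) here instead would fail to compile.
    bump = [](Counter* self, int delta) { self->value += delta; };
  }
};

// Calls through the function pointer twice and returns the accumulated value.
int BumpTwice(int delta) {
  Counter c;
  c.bump(&c, delta);
  c.bump(&c, delta);
  return c.value;
}
```

The object is passed explicitly as the first argument because the lambda cannot capture this and still convert to a function pointer.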

@tqchen @FrozenGene
Thank you!
Yes, a lambda that captures nothing can be converted to a function pointer.

This would also let TVM use MXTVMBridge without rebuilding MXNet from source.

OK, thanks @FrozenGene for the clarification. I think it might work for a simple call. However, it makes intrusive changes to PackedFunc as a pure C object, which I am not sure is ideal. It also does not handle natural deletion of the code.

To make things really compatible, I think we could make use of the following code on the MXNet side.

Great! I will try it.
But I’m not sure whether an outside program can call the PackedFunc object returned by WrapAsyncCall in MXNet.

In src/nnvm/tvm_bridge.cc of MXNet,

void WrapAsyncCall(TVMArgs wrap_args, TVMRetValue* wrap_rv) {
  ...
  *wrap_rv = PackedFunc(wrapped);
}

Update:
TVMFuncCall can call the PackedFunc object.

Would it be possible to expose the three APIs TVMFuncCreateFromCFunc, TVMFuncCall, and TVMFuncFree in MXNet?

Basically, on the MXNet side we might need to call these three functions. However, in order to keep it dependency-free, we might want to pass pointers to these functions into MXNetBridge.
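Passing the functions in as pointers could look roughly like the sketch below. Everything here is hypothetical and simplified: DemoFuncTable, DemoFuncHandle, GetProviderTable and BridgeApply are invented for illustration, and the signatures are far simpler than the real TVMFuncCreateFromCFunc, TVMFuncCall and TVMFuncFree. The point is only that the bridge side calls through a table it receives at run time and never links against the provider's symbols.

```cpp
#include <cassert>

// Hypothetical opaque handle and table of C-style entry points.
typedef void* DemoFuncHandle;

struct DemoFuncTable {
  int (*create)(int (*cfunc)(int), DemoFuncHandle* out);
  int (*call)(DemoFuncHandle f, int arg, int* ret);
  int (*free_func)(DemoFuncHandle f);
};

// Provider side (stands in for the library that owns the implementation).
namespace provider {
struct Impl { int (*cfunc)(int); };

int Create(int (*cfunc)(int), DemoFuncHandle* out) {
  *out = new Impl{cfunc};
  return 0;
}
int Call(DemoFuncHandle f, int arg, int* ret) {
  *ret = static_cast<Impl*>(f)->cfunc(arg);
  return 0;
}
int Free(DemoFuncHandle f) {
  delete static_cast<Impl*>(f);
  return 0;
}
}  // namespace provider

DemoFuncTable GetProviderTable() {
  return DemoFuncTable{provider::Create, provider::Call, provider::Free};
}

// Bridge side: uses only the table, so it has no link-time dependency
// on the provider.
int BridgeApply(const DemoFuncTable& api, int (*cfunc)(int), int arg) {
  DemoFuncHandle f = nullptr;
  if (api.create(cfunc, &f) != 0) return -1;
  int ret = 0;
  int status = api.call(f, arg, &ret);
  api.free_func(f);  // release the handle on the provider side
  return status == 0 ? ret : -1;
}

int Doubler(int x) { return 2 * x; }
```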

I wonder if it is okay for me to add MXTVMFuncCreateFromCFunc and MXTVMFuncCall APIs to mxnet/src/nnvm/tvm_bridge.cc, since the external program needs to create a PackedFunc instance to pass into MXTVMBridge and then use MXTVMFuncCall to call the functions returned from MXTVMBridge.

Thanks!

Sorry for the delayed reply. How about you do a quick prototype to see if the proof of concept works, and we move on from there? It is not hard to test: we can compile MXNet with gcc and TVM with clang and see if things run.

I will try. Thank you!

I have written the MXTVMFuncCreateFromCFunc and MXTVMFuncCall APIs in MXNet.

However, I couldn’t test them, since gcc-4 couldn’t be installed on my Arch Linux.

I found that the test code works when MXNet is built with gcc 8.2.1 and the test code is compiled with gcc 8.2.1, gcc 7.4.1, gcc 6.4.1, or clang 7.0.1.

Maybe the ABI is related to libstdc++.

The implementation of the <functional> header differs between compilers.
I replaced <functional> with the corresponding version to address the ABI compatibility problem.

Example:

The <functional> header file is under the GPL license. We should avoid modifying it and including it in the TVM project.

GCC 4 and GCC 5 have different ABIs. Since GCC 5, libstdc++ has a dual ABI, which can be selected with the _GLIBCXX_USE_CXX11_ABI macro. Could you provide more details on the ABI issue? Maybe we can find a better way.
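A quick way to see which setting a given build uses is to report the macro from a small translation unit. This is only a sketch; DualAbiSetting is a made-up helper name. As far as I know, the dual ABI mainly concerns std::string and std::list, so it may not explain a std::function mismatch by itself, but it is cheap to rule out.

```cpp
#include <cassert>
#include <string>  // including any libstdc++ header makes the macro visible

// Returns 1 for the new (C++11) ABI, 0 for the old ABI, and -1 when the
// macro is not defined at all (for example with libc++ or an older libstdc++).
int DualAbiSetting() {
#if defined(_GLIBCXX_USE_CXX11_ABI)
  return _GLIBCXX_USE_CXX11_ABI;
#else
  return -1;
#endif
}
```

To force the old ABI with GCC 5 or later, the code can be compiled with -D_GLIBCXX_USE_CXX11_ABI=0 on the command line.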

Thank you for the reminder. I will try the macro and provide more details.

My recommendation is to stay on C ABI and use the CPackedFunc as a bridge.

I found that the type _Invoker_type in <functional> differs between gcc4 and gcc8.
In gcc4,

    typedef _Res (*_Invoker_type)(const _Any_data&, _ArgTypes...);

However in gcc8,

    using _Invoker_type = _Res (*)(const _Any_data&, _ArgTypes&&...);

In gcc8, each _ArgTypes parameter is declared as an rvalue reference (_ArgTypes&&).
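The two declarations really are different function pointer types, which a small compile-time check can confirm. This is a sketch: AnyData is a stand-in for libstdc++'s internal _Any_data, and the alias names are invented here.

```cpp
#include <cassert>
#include <type_traits>

struct AnyData {};  // stand-in for the library-internal _Any_data

// Shape of the gcc4-style declaration: arguments passed as declared.
template <typename Res, typename... ArgTypes>
using InvokerByValue = Res (*)(const AnyData&, ArgTypes...);

// Shape of the gcc8-style declaration: arguments passed as ArgTypes&&.
template <typename Res, typename... ArgTypes>
using InvokerByRvalueRef = Res (*)(const AnyData&, ArgTypes&&...);

// For value argument types the two pointer types are distinct.
constexpr bool kSameInvokerType =
    std::is_same<InvokerByValue<void, int, double*>,
                 InvokerByRvalueRef<void, int, double*>>::value;
```

If one side calls through the first type while the other side implemented the second, a by-value argument would be read as if it were a reference, which would be consistent with the failure seen across mismatched builds.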

CPackedFunc is the better bridge.