Symbol visibility, linking and double loading


#1

Background. In some cases we may want to statically link part of LLVM into libtvm.so, for example to ship TVM as a JIT compiler without requiring users to install LLVM as a dependency.

Problem. However, this causes trouble when LLVM ends up loaded twice in the same process, even if we compile with default visibility set to hidden, i.e. with the configuration below:

cmake .. -DHIDE_PRIVATE_SYMBOLS=ON -DUSE_LLVM="/usr/bin/llvm-config-8 --ignore-libllvm"

A minimal example is:

$ python -c "import tvm, ctypes; ctypes.cdll.LoadLibrary('/usr/lib/x86_64-linux-gnu/libLLVM-8.so')"
: CommandLine Error: Option 'help-list' registered more than once!
LLVM ERROR: inconsistency in registered CommandLine options

Stripping the shared library does not make the issue disappear either.

Other examples are mentioned in another thread on the TVM forum.

Root cause. LLVM is loaded twice: once as part of libtvm.so, and once more by ctypes.cdll.LoadLibrary. Even though we never call into it, global static variables are initialized during library loading, including the global registries. Their entries are registered twice, which causes the error above.
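To make the failure mode concrete, here is a toy model in Python. It is not LLVM's code: OptionRegistry, load_llvm_copy and load_twice are hypothetical names, and the sketch assumes that both copies of LLVM end up registering into the same table when their symbols are visible, while each copy keeps a private table once the symbols are hidden.

```python
# Toy model only: this is NOT LLVM's code, just an analogy for its option registry.


class OptionRegistry:
    """Stands in for a process-wide table of command-line options."""

    def __init__(self):
        self._options = set()

    def register(self, name):
        # Registering the same option name twice is treated as a fatal error.
        if name in self._options:
            raise RuntimeError(f"Option '{name}' registered more than once!")
        self._options.add(name)


def load_llvm_copy(registry):
    """Mimics the static initializers that run when one copy of LLVM is loaded."""
    registry.register("help-list")


def load_twice(shared):
    """Load two copies, either sharing one registry or each with its own."""
    first = OptionRegistry()
    second = first if shared else OptionRegistry()
    try:
        load_llvm_copy(first)   # the copy linked into libtvm.so
        load_llvm_copy(second)  # the copy from libLLVM-8.so
    except RuntimeError as err:
        return str(err)
    return "ok"


if __name__ == "__main__":
    print(load_twice(shared=True))
    print(load_twice(shared=False))
```

With a shared registry the second load fails with the same message as in the log above; with separate registries both loads go through.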

Potential solution. In many cases, we can ask the linker to truly hide those symbols. For example, on many Linux systems, we can make them invisible even to nm:

cmake .. -DCMAKE_SHARED_LINKER_FLAGS="-Wl,--exclude-libs,ALL" \
         -DHIDE_PRIVATE_SYMBOLS=ON \
         -DUSE_LLVM="/usr/bin/llvm-config-8 --ignore-libllvm"

With this configuration, at least the command above works:

$ python -c "import tvm, ctypes; a = ctypes.cdll.LoadLibrary('/usr/lib/x86_64-linux-gnu/libLLVM-8.so'); print(tvm._ffi.base._LIB); print(a)"
<CDLL '/my/path/to/tvm/build/libtvm.so', handle 55f1b965ce60 at 0x7f34184284d0>
<CDLL '/usr/lib/x86_64-linux-gnu/libLLVM-8.so', handle 55f1b9778d70 at 0x7f3418911d10>
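For anyone who wants to turn the one-liner into a repeatable check, a minimal sketch follows. It runs the double load in a child interpreter so that a fatal LLVM error cannot take down the calling process. The helper names runs_cleanly and double_load_snippet are hypothetical, and the sketch assumes that the LLVM error terminates the child with a nonzero exit status.

```python
import subprocess
import sys


def runs_cleanly(snippet):
    """Run a Python snippet in a child interpreter; True if it exits with status 0."""
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode == 0


def double_load_snippet(llvm_path):
    """Build the same one-liner as above for a given libLLVM path."""
    return f"import tvm, ctypes; ctypes.cdll.LoadLibrary({llvm_path!r})"


if __name__ == "__main__":
    # Path taken from the example above; adjust it for your system.
    path = "/usr/lib/x86_64-linux-gnu/libLLVM-8.so"
    print("OK" if runs_cleanly(double_load_snippet(path)) else "FAILED")
```

Note that the check also reports FAILED when tvm cannot be imported at all, so it only says something about double loading on a machine where "import tvm" already works.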

And nm sees far fewer symbols:

# before
$ nm -gC build/libtvm.so | wc -l
30615
# after
$ nm -gC build/libtvm.so | wc -l
1210
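Counting lines is a coarse signal. A slightly more targeted check is to look for LLVM names among the symbols that are still exported. The sketch below parses the text printed by nm -gC. The helper names are hypothetical, and it assumes the usual nm layout where defined symbols have an address column and undefined ones start directly with the one-letter type.

```python
import subprocess


def defined_symbols(nm_output):
    """Return the names of defined symbols found in `nm -gC` text output."""
    names = []
    for line in nm_output.splitlines():
        parts = line.split()
        # Defined symbols look like "<address> <type> <name...>".
        # Undefined ones have no address, so the first field is the
        # one-letter type; those lines (and blank lines) are skipped.
        if len(parts) < 3 or len(parts[0]) == 1:
            continue
        names.append(" ".join(parts[2:]))
    return names


def leaked_symbols(nm_output, needle="llvm::"):
    """Return defined symbols whose demangled name contains `needle`."""
    return [name for name in defined_symbols(nm_output) if needle in name]


def nm_output(library_path):
    """Run `nm -gC` on a library and return its standard output as text."""
    result = subprocess.run(
        ["nm", "-gC", library_path],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout
```

For example, leaked_symbols(nm_output("build/libtvm.so")) should come back empty if the LLVM symbols are really hidden, assuming that the demangled LLVM names all live in the llvm:: namespace.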

My questions.

Is this a working solution? I played with the topi unit tests and everything seems to work, but I am not 100% confident.

What about other platforms? How could we achieve this on macOS and Windows? I am not very familiar with those platforms.

CC: @tqchen @haichen @Laurawly


#2

I did some experiments on macOS, and it works fine even without telling the linker to use "-Wl,--exclude-libs,ALL". However, I still have no idea how to hide those hidden symbols from nm. Any ideas?


#3

Also cc @jroesch, this would be an interesting way to package the lib.


#4

The downside of doing this is that it hides un-exposed symbols from stack traces, but I would not expect entry-level users to understand a stack trace anyway.


#5

This solves the issue here: OpenCL Runtime error