tvm.Module.export_library when target is llvm

Oewyn · October 17, 2018, 4:21pm

I’m running into problems trying to save a pre-compiled NNVM module to a file and load it back again:

File "/home/agsim/repos/agsim/install/python/agsim/utils/tvm_helper.py", line 35, in get_tvm_graph_runtime
lib.export_library(target_dir+'lib.o')
File "/home/agsim/virtualenvs/py3/lib/python3.5/site-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/module.py", line 121, in export_library
fcompile(file_name, files, **kwargs)
File "/home/agsim/virtualenvs/py3/lib/python3.5/site-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/contrib/cc.py", line 33, in create_shared
_linux_shared(output, objects, options, cc)
File "/home/agsim/virtualenvs/py3/lib/python3.5/site-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/contrib/cc.py", line 58, in _linux_shared
raise RuntimeError(msg)
RuntimeError: Compilation error:
/usr/bin/ld: /tmp/tmp37hq10iw/lib.o: relocation R_X86_64_32S against `.rodata.cst16' can not be used when making a shared object; recompile with -fPIC
/tmp/tmp37hq10iw/lib.o: error adding symbols: Bad value
collect2: error: ld returned 1 exit status

I think I’ve narrowed it down to the point where export_library calls save:
self.save(path_obj)
_SaveToFile(…)

I’m fairly certain that’s calling SaveToFile in src/codegen/llvm/llvm_module.cc

Is there something that I’m missing? I’m trying to follow the Get Started with NNVM tutorial but use LLVM instead of CUDA.

wweic · October 17, 2018, 4:42pm

@Oewyn Could you share the script that hits this issue? The error means an object file being linked into the shared library was not compiled as position-independent code. The default options do include -fPIC, though, so I need to see your code to understand why.

Oewyn · October 17, 2018, 5:03pm

I will work on a simple script that reproduces it. In the meantime, I believe the problem is the LLVM code that produces the .o file. This code (src/codegen/llvm/llvm_module.cc) is invoked when self.save(path_obj) is called in tvm/module.py:108. llvm_module.cc calls into LLVM directly to write lib.o to a temporary directory before _linux_shared is eventually called.

_linux_shared is indeed setting -fPIC. Here is the command that’s being sent to g++ to create the .so:

['g++', '-shared', '-fPIC', '-o', 'tvm_cache/squeezenet/lib.o', '/tmp/tmpuntb7bzr/lib.o']

/tmp/tmpuntb7bzr/lib.o is the file created by the LLVM calls, which I think is the problem because it was not compiled with -fPIC.
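
One way to check this hypothesis is to inspect the relocations in the temporary lib.o before it is linked. The sketch below is not part of TVM. It assumes binutils' readelf is on the PATH, and the helper names (absolute_relocations, check_object_file) are made up for illustration. It counts the absolute x86-64 relocation types (R_X86_64_32 and R_X86_64_32S) that ld rejects when building a shared object.

```python
import re
import subprocess
from collections import Counter

# Absolute 32-bit relocations that ld refuses when making an x86-64 shared object.
NON_PIC_RELOCS = {"R_X86_64_32", "R_X86_64_32S"}


def absolute_relocations(readelf_output):
    """Count non-PIC relocation types in the text printed by `readelf -r`."""
    found = re.findall(r"\bR_X86_64_\w+", readelf_output)
    return Counter(r for r in found if r in NON_PIC_RELOCS)


def check_object_file(path):
    """Run `readelf -r` on an object file (hypothetical helper; needs binutils)."""
    out = subprocess.run(
        ["readelf", "-r", "--wide", path],
        check=True, stdout=subprocess.PIPE, universal_newlines=True,
    ).stdout
    return absolute_relocations(out)
```

A non-empty result from check_object_file on the temporary lib.o would be consistent with the ld error; an empty result would suggest the cause lies elsewhere.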

Oewyn · October 17, 2018, 5:31pm

@wweic Here is the code to reproduce:

import nnvm.compiler
import nnvm.symbol as sym 

x = sym.Variable("x")
y = sym.Variable("y")
z = sym.elemwise_add(x, sym.sqrt(y))
compute_graph = nnvm.graph.create(z)

shape = (4,)
graph, lib, params = nnvm.compiler.build(
    compute_graph, target="llvm", shape={"x": shape}, dtype="float32")

lib_name = "deploy.so"
graph_name = "deploy.json"
params_name = "deploy.params"
lib.export_library(lib_name)
with open(graph_name, "w") as fo: 
    fo.write(graph.json())

wweic · October 17, 2018, 6:31pm

Hmm, I tested the code under Python 3.6.3 and 3.5.3, and both worked fine. What’s the stack trace you are seeing?

Oewyn · October 17, 2018, 11:32pm

Traceback (most recent call last):
File "a.py", line 19, in <module>
lib.export_library(lib_name)
File "/home/agsim/virtualenvs/py3/lib/python3.5/site-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/module.py", line 121, in export_library
fcompile(file_name, files, **kwargs)
File "/home/agsim/virtualenvs/py3/lib/python3.5/site-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/contrib/cc.py", line 33, in create_shared
_linux_shared(output, objects, options, cc)
File "/home/agsim/virtualenvs/py3/lib/python3.5/site-packages/tvm-0.5.dev0-py3.5-linux-x86_64.egg/tvm/contrib/cc.py", line 58, in _linux_shared
raise RuntimeError(msg)
RuntimeError: Compilation error:
/usr/bin/ld: /tmp/tmptxduqiul/lib.o: relocation R_X86_64_32S against `.rodata.cst4' can not be used when making a shared object; recompile with -fPIC
/tmp/tmptxduqiul/lib.o: error adding symbols: Bad value
collect2: error: ld returned 1 exit status

nhynes · October 18, 2018, 1:49pm

If you’re looking for a quick workaround, you can try compiling to .bc and then using clang to go to .o ([example](https://github.com/dmlc/tvm/blob/master/apps/sgx/enclave/Makefile#L34)).
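
A rough sketch of that workaround, driven from Python. It assumes that lib.save picks the LLVM bitcode format from a ".bc" file name, that the installed clang can read bitcode produced by the LLVM version TVM was built against, and that clang and g++ are on the PATH. The function names are illustrative, not TVM API.

```python
import subprocess


def bc_to_shared_commands(bc_path, obj_path, so_path, clang="clang", cxx="g++"):
    """Return two commands: bitcode -> PIC object, then object -> shared library."""
    compile_cmd = [clang, "-c", "-fPIC", bc_path, "-o", obj_path]
    link_cmd = [cxx, "-shared", "-fPIC", "-o", so_path, obj_path]
    return [compile_cmd, link_cmd]


def export_via_bitcode(lib, so_path, bc_path="lib.bc", obj_path="lib.o"):
    """Save the module as bitcode, then build the .so outside of TVM.

    Assumption: `lib` is the module returned by nnvm.compiler.build and
    `lib.save` chooses the bitcode format from the ".bc" suffix.
    """
    lib.save(bc_path)
    for cmd in bc_to_shared_commands(bc_path, obj_path, so_path):
        subprocess.check_call(cmd)
```

With the reproduction script above, this would be called as export_via_bitcode(lib, "deploy.so") in place of lib.export_library(lib_name). Keeping the command construction separate from the subprocess calls makes it easy to print the commands and run them by hand first.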

Oewyn · October 18, 2018, 4:30pm

Right now I’m saving LLVM IR with lib.save("lib.ll"), which seems to work, although I’m sure it’s slower on the loading side, since it has to JIT that to asm when loading.

I might look into your suggestion, @nhynes, but I will have to get a recent version of clang installed: the LLVM I’m using is 7.0.0, and clang is complaining that its reader is at 3.8.0.