VTA build and execution error with new 3rdparty vta-hw

tkclimb · April 13, 2020, 6:31am

I tried some tutorials on the latest commit of TVM (cd0d52daa6942bdafa9363ff6cfa3d25fcd5b8d6) and hit a build failure. It may be the same issue discussed in "VTA build on PYNQ fails at VTA.cmake:97". The following workaround made the build succeed:

diff --git a/cmake/modules/VTA.cmake b/cmake/modules/VTA.cmake
index 4af39e088..b4ef80d4c 100644
--- a/cmake/modules/VTA.cmake
+++ b/cmake/modules/VTA.cmake
@@ -102,6 +103,9 @@ elseif(PYTHON)
     # Target lib: vta
     add_library(vta SHARED ${FPGA_RUNTIME_SRCS})
     target_include_directories(vta PUBLIC vta/include)
+    target_include_directories(vta PUBLIC ${VTA_HW_PATH}/include)
     foreach(__def ${VTA_DEFINITIONS})
       string(SUBSTRING ${__def} 3 -1 __strip_def)
       target_compile_definitions(vta PUBLIC ${__strip_def})

However, I could not run it correctly on physical devices (both PYNQ and Ultra96), because libvta.so doesn’t seem to contain the symbol VTARuntimeShutdown. The error message is pasted below. Does anyone know what causes this?

xilinx@pynq:~/tvm$ sudo ./apps/vta_rpc/start_rpc_server.sh
INFO:RPCServer:bind to 0.0.0.0:9091
INFO:RPCServer:connection from ('172.16.19.12', 60962)
INFO:root:Skip reconfig_runtime due to same config.
INFO:root:Program FPGA with 1x16_i8w8a32_15_15_18_17.bit
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
INFO:RPCServer:load_module /tmp/tmpxhytgs15/gemm.o
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so
...
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/xilinx/tvm/python/tvm/rpc/server.py", line 84, in _serve_loop
    base._ServerLoop(sockfd)
  File "/home/xilinx/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 216, in __call__
    raise get_last_ffi_error()
AttributeError: Traceback (most recent call last):
  [bt] (5) /home/xilinx/tvm/build/libtvm_runtime.so(TVMFuncCall+0x70) [0x7fad00e4d8]
  [bt] (4) /home/xilinx/tvm/build/libtvm_runtime.so(+0xadc90) [0x7fad08bc90]
  [bt] (3) /home/xilinx/tvm/build/libtvm_runtime.so(+0xacac8) [0x7fad08aac8]
  [bt] (2) /home/xilinx/tvm/build/libtvm_runtime.so(tvm::runtime::RPCServerLoop(int)+0xac) [0x7fad089dfc]
  [bt] (1) /home/xilinx/tvm/build/libtvm_runtime.so(tvm::runtime::RPCSession::ServerLoop()+0x1cc) [0x7fad07fd8c]
  [bt] (0) /home/xilinx/tvm/build/libtvm_runtime.so(+0x2c750) [0x7fad00a750]
  File "/home/xilinx/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 78, in cfun
    rv = local_pyfunc(*pyargs)
  File "/home/xilinx/tvm/vta/python/vta/exec/rpc_server.py", line 84, in server_shutdown
    runtime_dll[0].VTARuntimeShutdown()
  File "/usr/lib/python3.6/ctypes/__init__.py", line 361, in __getattr__
    func = self.__getitem__(name)
  File "/usr/lib/python3.6/ctypes/__init__.py", line 366, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /home/xilinx/tvm/vta/python/vta/../../../build/libvta.so: undefined symbol: VTARuntimeShutdown
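
One way to confirm the diagnosis is to ask the dynamic loader whether the library exports the symbol at all. The following is a minimal sketch, not part of TVM: the helper name `exports_symbol` is hypothetical, and the library path is the one from the log above with the `../../../` resolved. It relies only on the standard `ctypes` module, whose attribute lookup on a loaded library raises `AttributeError` for a missing symbol, which is the same failure shown in the traceback.

```python
import ctypes


def exports_symbol(lib: ctypes.CDLL, name: str) -> bool:
    """Return True if the loaded shared library exports `name`."""
    try:
        getattr(lib, name)
    except AttributeError:
        # ctypes raises AttributeError when the symbol lookup fails.
        return False
    return True


if __name__ == "__main__":
    # Path resolved from the log above; adjust it to your build directory.
    path = "/home/xilinx/tvm/build/libvta.so"
    try:
        lib = ctypes.CDLL(path)
    except OSError as err:
        print(f"could not load {path}: {err}")
    else:
        found = exports_symbol(lib, "VTARuntimeShutdown")
        print("VTARuntimeShutdown exported:", found)
```

If this reports that the symbol is not exported, the problem is in how libvta.so was built rather than in the RPC server itself.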

wjc852456 · April 19, 2020, 1:08pm

I meet the same problem. Have you fixed it yet?

acapone13 · April 27, 2020, 9:48am

I found that VTARuntimeShutdown is defined inside vta/runtime. Before the refactoring, these files lived in include/vta and vta/src; apparently they are no longer compiled with VTA. I tried adding them to the VTA build by putting the following line in the USE_VTA_FPGA section of VTA.cmake:

file(GLOB FPGA_RUNTIME_SRCS vta/runtime/*cc)

Unfortunately, I didn’t succeed: the tutorials still fail with the same error shown above. Were you able to solve the issue?
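
To see where a symbol lives after the refactoring, it can help to search the source tree directly. The sketch below assumes a local TVM checkout; the helper name `files_mentioning` is hypothetical, and the suffix list is only a guess at which files are relevant.

```python
from pathlib import Path


def files_mentioning(root, symbol, suffixes=(".cc", ".h")):
    """Return sorted paths under `root` whose text contains `symbol`."""
    hits = []
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            if symbol in text:
                hits.append(path)
    return sorted(hits)


if __name__ == "__main__":
    # "vta" is the directory named in this thread; run from the TVM root.
    for path in files_mentioning("vta", "VTARuntimeShutdown"):
        print(path)
```

Any file listed here that is not matched by the glob in VTA.cmake would not end up in libvta.so.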

wowow11111 · May 12, 2020, 4:18am

I recently updated TVM and ran into the same problem. Has anyone solved it?

tkclimb · May 12, 2020, 8:42am

You can at least fix the build problem with the solution @acapone13 mentioned, but you will probably still run into the execution problem.

acapone13 · May 19, 2020, 9:27am

With the latest bug-fix update (commit 63f84a11353791ac3f8916cdcf7c2c6e6d45c4fb) and the following patch to VTA.cmake:

--- a/cmake/modules/VTA.cmake
+++ b/cmake/modules/VTA.cmake
@@ -102,7 +102,7 @@ elseif(PYTHON)
     endif()
     # Target lib: vta
     add_library(vta SHARED ${FPGA_RUNTIME_SRCS})
-    target_include_directories(vta PUBLIC vta/include)
+    target_include_directories(vta PUBLIC vta/runtime)
     foreach(__def ${VTA_DEFINITIONS})
       string(SUBSTRING ${__def} 3 -1 __strip_def)
       target_compile_definitions(vta PUBLIC ${__strip_def})

I was able to solve the execution problem. I tested with the test scripts and some of the tutorials, and everything has worked so far. After the refactoring, runtime.h was not being included in the shared library.
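
For repeated rebuilds, the one-line change can be applied with a small script rather than by hand. This is only a sketch of the textual substitution shown in the diff above; the function name `swap_include_dir` is hypothetical, and it assumes the old line appears verbatim in cmake/modules/VTA.cmake.

```python
from pathlib import Path

# Both lines are copied from the diff above.
OLD = "target_include_directories(vta PUBLIC vta/include)"
NEW = "target_include_directories(vta PUBLIC vta/runtime)"


def swap_include_dir(cmake_text: str) -> str:
    """Replace the old include-directory line; raise if it is absent."""
    if OLD not in cmake_text:
        raise ValueError("expected line not found; file may already be patched")
    return cmake_text.replace(OLD, NEW)


if __name__ == "__main__":
    # Path from the diff header; run from the TVM root.
    path = Path("cmake/modules/VTA.cmake")
    if path.exists():
        path.write_text(swap_include_dir(path.read_text()))
```

The explicit error on a missing line is a deliberate choice, so that running the script against an already patched or newer file does not silently do nothing.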

wowow11111 · May 19, 2020, 11:43am

@acapone13
Thanks for letting me know, but I reinstalled TVM from scratch and still face the same issue. I also tried applying the patch to VTA.cmake by removing

target_include_directories(vta PUBLIC vta/include)

and adding

target_include_directories(vta PUBLIC vta/runtime)

but the same issue still appears. I’d really appreciate it if you could help me out here. Thank you.

acapone13 · May 19, 2020, 4:37pm

I tried reinstalling TVM on another PYNQ image and ran into the same issues. My workaround only works if you first build TVM and VTA following the installation guide, and then rebuild after modifying the line I mentioned. I still can’t pin down the underlying problem well enough to offer a proper solution. You can try what I did, but treat it as a temporary, unreliable workaround.

wowow11111 · May 19, 2020, 6:45pm

All right, thank you so much anyways.

Guess I’m gonna go back to the last version that worked well.

thierry · May 20, 2020, 2:23am

@acapone13 you are indeed correct. I tested your change to the CMake file, and together with 63f84a11353791ac3f8916cdcf7c2c6e6d45c4fb it should bring back the Pynq functionality.

This is pending the merge of this PR: https://github.com/apache/incubator-tvm/pull/5630

If you follow these instructions: https://docs.tvm.ai/vta/install.html (installing on the host, with a PynqZ1 running Pynq image v2.5), you should be able to run deploy_classification.py without issues.

Here’s the output I got from the experiment:

Reconfigured FPGA and RPC runtime in 3.07s!
Cannot find config for target=ext_dev -device=vta -keys=cpu -model=pynq_1x16_i8w8a32_15_15_18_17, workload=('conv2d_NCHWc.x86', ('TENSOR', (1, 3, 224, 224), 'float32'), ('TENSOR', (64, 3, 7, 7), 'float32'), (2, 2), (3, 3, 3, 3), (1, 1), 'NCHW', 'NCHW', 'float32'). A fallback configuration is used, which may bring great performance regression.
resnet18_v1 inference graph built in 13.35s!
File synset.txt exists, skip.
File cat.png exists, skip.

Performed inference in 371.38ms (std = 0.04) for 1 samples
Average per sample inference time: 371.38ms

resnet18_v1 prediction for sample 0
	#1: tiger cat
	#2: Egyptian cat
	#3: tabby, tabby cat
	#4: lynx, catamount
	#5: weasel

My apologies for the bug, we need better hardware CI/nightlies to catch bugs that aren’t caught in simulation.

tirumalnaidu · November 18, 2020, 8:39am

Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/root/tvm/python/tvm/rpc/server.py", line 118, in _serve_loop
    _ffi_api.ServerLoop(sockfd)
  File "/home/root/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 237, in __call__
    raise get_last_ffi_error()
AttributeError: Traceback (most recent call last):
  [bt] (4) /root/tvm/build/libtvm_runtime.so(TVMFuncCall+0x37) [0xb379a4f4]
  [bt] (3) /root/tvm/build/libtvm_runtime.so(+0x8c1f8) [0xb37f71f8]
  [bt] (2) /root/tvm/build/libtvm_runtime.so(tvm::runtime::RPCServerLoop(int)+0x6b) [0xb37f6aa0]
  [bt] (1) /root/tvm/build/libtvm_runtime.so(tvm::runtime::RPCEndpoint::ServerLoop()+0x13f) [0xb37e44f0]
  [bt] (0) /root/tvm/build/libtvm_runtime.so(+0x2d340) [0xb3798340]
  File "/home/root/tvm/python/tvm/_ffi/_ctypes/packed_func.py", line 81, in cfun
    rv = local_pyfunc(*pyargs)
  File "/home/root/tvm/vta/python/vta/exec/rpc_server.py", line 84, in server_shutdown
    runtime_dll[0].VTARuntimeShutdown()
  File "/usr/lib/python3.6/ctypes/__init__.py", line 361, in __getattr__
    func = self.__getitem__(name)
  File "/usr/lib/python3.6/ctypes/__init__.py", line 366, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /home/root/tvm/vta/python/vta/../../../build/libvta.so: undefined symbol: VTARuntimeShutdown

I am facing this error when running any of the examples on the DE10 Nano. Hasn’t this bug been fixed?