[VTA] Failed to run metal test

Hi Community,

I tried to run the metal test (tvm/vta/test/hardware/metal_tests), but it failed as shown below:

INFO - ALU test of max imm: batch=16, vector_size=128, uop_compression=1
INFO - Synchronization time: 0.067ms
INFO - Throughput: 0.031GOps/s
INFO - ALU test failed, got 128 errors!

INFO - ALU test of max imm: batch=16, vector_size=128, uop_compression=0
INFO - Synchronization time: 0.064ms
INFO - Throughput: 0.032GOps/s
INFO - ALU test successful!

INFO - ALU test of add imm: batch=16, vector_size=128, uop_compression=1
INFO - Synchronization time: 0.064ms
INFO - Throughput: 0.032GOps/s
INFO - ALU test successful!

INFO - ALU test of add imm: batch=16, vector_size=128, uop_compression=0
INFO - Synchronization time: 0.064ms
INFO - Throughput: 0.032GOps/s
INFO - ALU test successful!

INFO - ALU test of shr: batch=16, vector_size=128, uop_compression=1
INFO - Synchronization time: 0.064ms
INFO - Throughput: 0.032GOps/s
INFO - ALU test successful!

INFO - ALU test of shr: batch=16, vector_size=128, uop_compression=0
INFO - Synchronization time: 0.063ms
INFO - Throughput: 0.032GOps/s
INFO - ALU test successful!

INFO - ALU test of max: batch=16, vector_size=128, uop_compression=1
INFO - Synchronization time: 0.078ms
INFO - Throughput: 0.026GOps/s
INFO - ALU test failed, got 127 errors!

INFO - ALU test of max: batch=16, vector_size=128, uop_compression=0
INFO - Synchronization time: 0.073ms
INFO - Throughput: 0.028GOps/s
INFO - ALU test successful!

INFO - ALU test of add: batch=16, vector_size=128, uop_compression=1
INFO - Synchronization time: 0.075ms
INFO - Throughput: 0.027GOps/s
INFO - ALU test successful!

INFO - ALU test of add: batch=16, vector_size=128, uop_compression=0
INFO - Synchronization time: 0.073ms
INFO - Throughput: 0.028GOps/s
INFO - ALU test successful!

INFO - Blocked GEMM test: batch=256, channels=256, block=64, uop_comp=1, vt=2
INFO - Synchronization time: 3.283ms
INFO - Throughput: 10.222GOPs/s
INFO - Blocked GEMM test failed, got 4212 errors!

INFO - Blocked GEMM test: batch=256, channels=256, block=64, uop_comp=0, vt=2
INFO - Synchronization time: 3.298ms
INFO - Throughput: 10.175GOPs/s
INFO - Blocked GEMM test failed, got 65277 errors!

INFO - Blocked GEMM test: batch=256, channels=256, block=64, uop_comp=1, vt=1
INFO - Synchronization time: 4.884ms
INFO - Throughput: 6.871GOPs/s
INFO - Blocked GEMM test failed, got 65289 errors!

INFO - Blocked GEMM test: batch=256, channels=256, block=64, uop_comp=0, vt=1
INFO - Synchronization time: 4.704ms
INFO - Throughput: 7.133GOPs/s
INFO - Blocked GEMM test successful!

INFO - Unit tests failed!
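For anyone reading the log, the throughput figures can be reproduced from the test sizes and the synchronization time. The sketch below is only a sanity check: the operation counts (one op per ALU element, two ops per multiply-accumulate in the GEMM) are assumptions inferred from the numbers in the log, not taken from the test source, and the function names are hypothetical.

```python
def alu_gops(batch, vector_size, sync_ms):
    """Throughput in GOps/s, assuming one op per element (assumption)."""
    ops = batch * vector_size
    return ops / (sync_ms * 1e-3) / 1e9


def gemm_gops(batch, channels, sync_ms):
    """Throughput in GOps/s, assuming 2 ops (multiply + add) per MAC and
    a batch x channels x channels problem (assumption)."""
    ops = 2 * batch * channels * channels
    return ops / (sync_ms * 1e-3) / 1e9


# Values taken from the log above.
print(f"ALU  max imm, uop_compression=1: {alu_gops(16, 128, 0.064):.3f} GOps/s")
print(f"GEMM uop_comp=1, vt=2:           {gemm_gops(256, 256, 3.283):.3f} GOps/s")
print(f"GEMM uop_comp=0, vt=1:           {gemm_gops(256, 256, 4.704):.3f} GOps/s")
```

The results agree with the logged throughput to within the rounding of the reported times, so the timing lines look self-consistent; the failures are in the computed data, not in the measurement.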

I built vta with this configuration:
{
"TARGET" : "pynq",
"HW_FREQ" : 142,
"HW_CLK_TARGET" : 6,
"HW_VER" : "0.0.0",
"LOG_INP_WIDTH" : 3,
"LOG_WGT_WIDTH" : 3,
"LOG_ACC_WIDTH" : 5,
"LOG_OUT_WIDTH" : 3,
"LOG_BATCH" : 0,
"LOG_BLOCK_IN" : 4,
"LOG_BLOCK_OUT" : 4,
"LOG_UOP_BUFF_SIZE" : 15,
"LOG_INP_BUFF_SIZE" : 15,
"LOG_WGT_BUFF_SIZE" : 18,
"LOG_ACC_BUFF_SIZE" : 17
}
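To make the log2-encoded fields easier to read, here is a minimal sketch that expands this configuration into element sizes and buffer depths. The formulas are assumptions based on my understanding of how VTA derives these values (element bytes = tile elements times bit width / 8, depth = buffer bytes / element bytes), and the helper name `derive` is hypothetical. The element sizes it produces match the VTA_*_ELEM_BYTES values in the parameter dump posted later in this thread for the same widths and block sizes.

```python
config = {
    "HW_FREQ": 142,
    "LOG_INP_WIDTH": 3, "LOG_WGT_WIDTH": 3, "LOG_ACC_WIDTH": 5,
    "LOG_BATCH": 0, "LOG_BLOCK_IN": 4, "LOG_BLOCK_OUT": 4,
    "LOG_INP_BUFF_SIZE": 15, "LOG_WGT_BUFF_SIZE": 18, "LOG_ACC_BUFF_SIZE": 17,
}


def derive(cfg):
    """Expand log2 fields into sizes (formulas are assumptions, see above)."""
    batch = 1 << cfg["LOG_BATCH"]
    block_in = 1 << cfg["LOG_BLOCK_IN"]
    block_out = 1 << cfg["LOG_BLOCK_OUT"]
    inp_bits = 1 << cfg["LOG_INP_WIDTH"]
    wgt_bits = 1 << cfg["LOG_WGT_WIDTH"]
    acc_bits = 1 << cfg["LOG_ACC_WIDTH"]

    # Bytes occupied by one tile element in each on-chip buffer.
    inp_elem = batch * block_in * inp_bits // 8
    wgt_elem = block_out * block_in * wgt_bits // 8
    acc_elem = batch * block_out * acc_bits // 8

    return {
        "INP_ELEM_BYTES": inp_elem,
        "WGT_ELEM_BYTES": wgt_elem,
        "ACC_ELEM_BYTES": acc_elem,
        # Number of tile elements that fit in each buffer.
        "INP_BUFF_DEPTH": (1 << cfg["LOG_INP_BUFF_SIZE"]) // inp_elem,
        "WGT_BUFF_DEPTH": (1 << cfg["LOG_WGT_BUFF_SIZE"]) // wgt_elem,
        "ACC_BUFF_DEPTH": (1 << cfg["LOG_ACC_BUFF_SIZE"]) // acc_elem,
        # Clock period implied by HW_FREQ (MHz), in nanoseconds.
        "CLOCK_PERIOD_NS": 1000.0 / cfg["HW_FREQ"],
    }


for name, value in derive(config).items():
    print(f"{name}: {value}")
```

Note that HW_FREQ of 142 MHz implies a clock period of roughly 7 ns, which is useful context when reading timing reports from the build.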

Board is PYNQ-Z1

Can anyone point me in the right direction to fix this so that the metal test passes?

The metal tests are deprecated and are being overhauled; their failures are not a sign that your design is failing. Have you tried running the Python unit tests instead?

The Python unit test still fails. The logs are below:
1> Host side
Traceback (most recent call last):
File "test_vta_insn.py", line 548, in <module>
test_save_load_out()
File "test_vta_insn.py", line 82, in test_save_load_out
vta.testing.run(_run)
File "/home/alex/tvm/vta/python/vta/testing/util.py", line 73, in run
run_func(env, remote)
File "test_vta_insn.py", line 75, in _run
f(x_nd, y_nd)
File "/home/alex/tvm/python/tvm/_ffi/function.py", line 153, in __call__
return f(*args)
File "/home/alex/tvm/python/tvm/_ffi/_ctypes/function.py", line 209, in __call__
raise get_last_ffi_error()
tvm._ffi.base.TVMError: Traceback (most recent call last):
[bt] (8) /home/xilinx/tvm/build/libtvm_runtime.so(tvm::runtime::RPCSession::HandleUntilReturnEvent(tvm::runtime::TVMRetValue*, bool, tvm::runtime::PackedFunc const*)+0x103) [0xb595df50]
[bt] (7) /home/xilinx/tvm/build/libtvm_runtime.so(tvm::runtime::RPCSession::EventHandler::HandleNextEvent(tvm::runtime::TVMRetValue*, bool, tvm::runtime::PackedFunc const*)+0x203) [0xb5961d50]
[bt] (6) /home/xilinx/tvm/build/libtvm_runtime.so(tvm::runtime::RPCSession::EventHandler::HandleRecvPackedSeqArg()+0x23f) [0xb5961860]
[bt] (5) /home/xilinx/tvm/build/libtvm_runtime.so(tvm::runtime::RPCSession::EventHandler::SwitchToState(tvm::runtime::RPCSession::EventHandler::State)+0x1eb) [0xb5960b94]
[bt] (4) /home/xilinx/tvm/build/libtvm_runtime.so(tvm::runtime::RPCSession::EventHandler::HandlePackedCall()+0x417) [0xb595ca80]
[bt] (3) /home/xilinx/tvm/build/libtvm_runtime.so(+0x372f2) [0xb59362f2]
[bt] (2) /tmp/tmpte7_n2gz/load_act.o.so(default_function+0x18) [0xb560e820]
[bt] (1) /home/xilinx/tvm/vta/python/vta/…/…/…/build/libvta.so(VTATLSCommandHandle+0xc5) [0xb5641572]
[bt] (0) /home/xilinx/tvm/vta/python/vta/…/…/…/build/libvta.so(std::__shared_ptr<vta::CommandQueue, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<vta::CommandQueue> >(std::_Sp_make_shared_tag, std::allocator<vta::CommandQueue> const&)+0x42d) [0xb56460b6]
TVMError: Except caught from RPC call: [05:31:09] /home/xilinx/tvm/vta/src/runtime.cc:319: Check failed: dram_buffer != nullptr:
2> Pynq board side
INFO:RPCServer:connection from ('192.168.2.1', 54380)
INFO:root:Loading VTA library: /home/xilinx/tvm/vta/python/vta/…/…/…/build/libvta.so
INFO:RPCServer:load_module /tmp/tmpte7_n2gz/load_act.o
terminate called after throwing an instance of 'dmlc::Error'
what(): [05:31:09] /home/xilinx/tvm/vta/src/runtime.cc:319: Check failed: dram_buffer != nullptr:
Stack trace:
[bt] (0) /home/xilinx/tvm/vta/python/vta/…/…/…/build/libvta.so(std::__shared_ptr<vta::CommandQueue, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<vta::CommandQueue> >(std::_Sp_make_shared_tag, std::allocator<vta::CommandQueue> const&)+0x42d) [0xb56460b6]
[bt] (1) /home/xilinx/tvm/vta/python/vta/…/…/…/build/libvta.so(VTARuntimeShutdown+0x175) [0xb56413ea]
[bt] (2) /usr/lib/arm-linux-gnueabihf/libffi.so.6(ffi_call_VFP+0x54) [0xb687ab54]
[bt] (3) /usr/lib/arm-linux-gnueabihf/libffi.so.6(ffi_call+0xdb) [0xb687b1f0]

However, when I built VTA, the build reported running out of DSP resources and a critical timing warning.

Hi,

I'm facing an issue with the GEMM test cases. Please find the test result and parameters below:

=====================================================================
Size of VTAInsn: 16
Size of VTAUop: 4
VTA_UOP_BUFF_DEPTH: 2048
VTA_LOG_UOP_BUFF_DEPTH: 11
VTA_WGT_BUFF_DEPTH: 512
VTA_LOG_WGT_BUFF_DEPTH: 9
VTA_INP_BUFF_DEPTH: 2048
VTA_LOG_INP_BUFF_DEPTH: 11
VTA_ACC_BUFF_DEPTH: 1024
VTA_LOG_ACC_BUFF_DEPTH: 10
VTA_WGT_WORDS: 131072
VTA_INP_WORDS: 32768
VTA_ACC_WORDS: 16384
VTA_INS_ELEM_BYTES: 16
VTA_UOP_ELEM_BYTES: 16
VTA_INP_ELEM_BYTES: 16
VTA_WGT_ELEM_BYTES: 256
VTA_ACC_ELEM_BYTES: 64
VTA_BLOCK_IN: 16
VTA_BLOCK_OUT: 16

INFO - Blocked GEMM test: batch=4, in_channels=64, out_channels=64, uop_comp=0
INFO - Blocked GEMM test failed, got 207 errors!

Thanks, Siva