Several questions about setting up VTA on FPGA

PioneerGen · December 28, 2018, 10:22am

1、The code under /tvm/vta mainly targets Xilinx FPGAs. Does VTA support deploying DL workloads on the Altera Arria DE5a-Net-i2 or other boards?
2、The code generates Verilog for the FPGA. Can TVM generate OpenCL to deploy DL workloads on FPGAs? If so, does deployment still involve the make, build, and program-the-FPGA steps?
3、The inference example is based on ResNet: the first conv operation runs on the CPU and the others run on the FPGA. Can all the operations of MobileNet be offloaded to the FPGA?

liangfu · December 28, 2018, 4:06pm

Hi, I’m glad to hear that someone is working on this. There seems to be huge demand for enabling VTA on Intel FPGAs. There is a WIP PR and an RFC discussion for implementing and discussing this feature.

More specifically,

1. After I finish debugging the generated hardware proposed in the above PR, it should be possible to enable VTA on any Intel FPGA with an on-chip SoC.
2. Running through OpenCL on Intel FPGAs has already been implemented; see the AOCL Backend Example.
3. You need to quantize the parameters in MobileNet and optimize the tensorization code for depth-wise convolution in order to deploy it on the FPGA.
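To make the quantization step in the last point concrete, here is a minimal sketch of symmetric per-tensor int8 quantization in plain Python. This is only an illustration under my own assumptions: the function names `quantize_int8` and `dequantize` are hypothetical, the weights are made-up values, and this is not necessarily the scheme TVM or VTA uses.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization (illustrative, not TVM's API).

    Maps floats to integers in [-127, 127] using a single scale factor.
    """
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor to avoid dividing by zero.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the int8 representation."""
    return [v * scale for v in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
print(q, scale, approx)
```

The rounding error per weight is at most half of `scale`, which is why the choice of scale (per tensor here, per channel in more careful schemes) matters for accuracy.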

thierry · December 28, 2018, 5:32pm

Liangfu is right.

(1.) Altera/Intel FPGA support is WIP, using Altera’s equivalent of HLS. There is also a parallel effort to build a Chisel backend that can generate Verilog for any FPGA, but this will probably be ready after Altera support. Driver support, which is part of the PR, is required for both flows.
(2.) Indeed, you can try the AOCL backend that kabata worked on (https://github.com/dmlc/tvm/pull/1474), which works well for single operators at the moment (full network support is WIP).
(3.) To add to liangfu’s comment, right now offloading the first layer to the FPGA is tricky since the GEMM unit requires at least 16 channels (the first layer has only 3). There are some spatial tricks you can use, but they add processing latency to the point where a CPU remains competitive.
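A rough way to see the cost behind point (3.) is sketched below. It assumes a GEMM input-channel block of 16 (taken from the "at least 16 channels" remark) and assumes naive zero-padding of the channel dimension, which is not necessarily the spatial trick mentioned above. The names `BLOCK_IN`, `padded_channels`, and `gemm_utilization` are hypothetical and used only for illustration.

```python
BLOCK_IN = 16  # assumed GEMM input-channel block size


def padded_channels(channels, block=BLOCK_IN):
    """Round the channel count up to the next multiple of the block size."""
    return -(-channels // block) * block  # ceiling division, then scale back


def gemm_utilization(channels, block=BLOCK_IN):
    """Fraction of the GEMM input lanes doing useful work after zero-padding."""
    return channels / padded_channels(channels, block)


# Hypothetical layers: an RGB first layer (3 channels) and a later 64-channel layer.
for name, channels in {"first_conv": 3, "later_conv": 64}.items():
    print(name, padded_channels(channels), gemm_utilization(channels))
```

Under these assumptions the first layer would keep only 3 of every 16 input lanes busy, which gives some intuition for why running it on the CPU stays competitive.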

liangfu · December 29, 2018, 12:43am

In addition, I’m working on implementing the compute component in Chisel3 to make it available for Qsys; see https://github.com/liangfu/chisel-vta/tree/devel/chisel. The ALU function is almost ready.

PioneerGen · January 4, 2019, 7:39am

But when I install AOCL 17.0 on Ubuntu 16.04 LTS using the command “aocl install”, specifically at the step where the PCIe driver needs to be installed on Ubuntu, a problem occurs. Error below:

According to the official guide, installing AOCL is recommended on CentOS or Red Hat, and the guide only provides the procedure for CentOS. So I don’t know how to proceed with installing the PCIe driver on Ubuntu. Can you give me some help?