Creating a custom vta bitstream for PYNQ Z1

Hi, I followed the TensorFlow tutorial and the VTA installation tutorial in the TVM tutorials. As far as I can tell, the VTA tutorials only use pre-compiled bitstreams, which are downloaded from a GitHub repository by default. I want to know whether there is a step-by-step guide to generating a custom bitstream from a TensorFlow (or any other) machine learning model, which can then be run on a PYNQ FPGA.

Hi @thilinawee ,

Following are the VTA bitstream build steps from the VTA installation tutorial. The VTA bitstream is not coupled to any frontend: it leaves the control flow on the CPU and provides FPGA implementations of the GEMM/element-wise operators to help accelerate the algorithm.
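To make the division of labor concrete, here is an illustrative pure-Python sketch (not VTA code) of the two operator classes the bitstream implements in hardware; the host CPU keeps the surrounding control flow and hands tiles like these to the accelerator. Real VTA operates on quantized, tiled tensors, so this is only a conceptual sketch:

```python
# Illustrative only: a naive GEMM (C = A x B), the main operator
# class that the VTA bitstream accelerates in hardware.
def gemm(a, b):
    n, k = len(a), len(b)
    m = len(b[0])
    c = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                c[i][j] += a[i][p] * b[p][j]
    return c

# Element-wise add, the other operator class VTA offloads.
def elewise_add(a, b):
    return [[x + y for x, y in zip(ra, rb)]
            for ra, rb in zip(a, b)]

print(gemm([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # -> [[19, 22], [43, 50]]
```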


Bitstream generation is driven by a top-level Makefile under <tvm root>/vta/hardware/xilinx/.

If you just want to simulate the VTA design in software emulation to make sure that it is functional, enter:

cd <tvm root>/vta/hardware/xilinx
make ip MODE=sim

If you just want to generate the HLS-based VTA IP cores without launching the full place-and-route flow, enter:

make ip

You’ll be able to view the HLS synthesis reports under <tvm root>/vta/build/hardware/xilinx/hls/<configuration>/<block>/solution0/syn/report/<block>_csynth.rpt

> Note: The <configuration> name is a string that summarizes the VTA configuration parameters listed in vta_config.json. The <block> name refers to the specific module (or HLS function) that composes the high-level VTA pipeline.
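As an example of how such a <configuration> summary string could be derived from the JSON parameters, here is a hedged Python sketch. The field names below mirror those found in TVM's vta_config.json, but the summary format itself (and the `config_summary` helper) is an assumption for illustration, not the tutorial's actual naming scheme:

```python
import json

# Sample parameters in the style of vta_config.json.
# Field names mirror TVM's file; values are examples.
SAMPLE_CONFIG = """
{
  "TARGET": "pynq",
  "HW_VER": "0.0.2",
  "LOG_INP_WIDTH": 3,
  "LOG_WGT_WIDTH": 3,
  "LOG_ACC_WIDTH": 5,
  "LOG_BATCH": 0,
  "LOG_BLOCK": 4
}
"""

def config_summary(cfg_json):
    """Hypothetical helper: build a short string summarizing the
    VTA configuration (tile shape and operand bit-widths)."""
    cfg = json.loads(cfg_json)
    batch = 1 << cfg["LOG_BATCH"]   # GEMM batch dimension
    block = 1 << cfg["LOG_BLOCK"]   # GEMM block dimension
    return "{}_{}x{}_i{}w{}a{}".format(
        cfg["TARGET"], batch, block,
        1 << cfg["LOG_INP_WIDTH"],   # input bits
        1 << cfg["LOG_WGT_WIDTH"],   # weight bits
        1 << cfg["LOG_ACC_WIDTH"])   # accumulator bits

print(config_summary(SAMPLE_CONFIG))  # -> pynq_1x16_i8w8a32
```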

Finally, to run the full hardware compilation and generate the VTA bitstream, run:

make

Hi @hjiang,
Thank you so much for your reply. I had slightly misunderstood and thought we could directly generate a bitstream from any ML model using TVM. Now I understand that the VTA bitstream is a specific design on the FPGA, and that TVM accelerates the ML model by compiling it into a set of instructions that run on that FPGA design. I am currently following this article to learn more about VTA

@hjiang - thanks for the reply! Indeed, you can think of VTA as a fixed architecture whose parameters you can tweak. But to take a model and run it on VTA, you essentially need to produce an “executable”: a program that runs inference for that model on VTA (just like how we program CPUs and GPUs).
