VTA hardware architecture configuration

Hi! I’m trying out VTA on a different board (ZCU104) and have managed to run all the tutorial examples. I’d like to explore how the hardware resources could be better utilized, for example with a different GEMM design.

I learned from the VTA technical report that “The VTA architecture is fully parameterizable: the shape of the GEMM tensor intrinsic can be modified to influence the utilization of hardware resources.” If I understand it correctly, the shape of the GEMM should be configurable through “LOG_BATCH” and “LOG_BLOCK” in “vta_config.json”. I should also be able to change the buffer sizes through “LOG_UOP/INP/WGT/ACC_BUFF_SIZE”.
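To make the parameters above concrete, here is a minimal Python sketch of how the log2 settings might translate into a GEMM shape. It assumes that a single “LOG_BLOCK” value sets both the input and output block dimensions, and the helper names (gemm_shape, macs_per_gemm) and the example values are made up for illustration; they are not part of VTA.

```python
def gemm_shape(log_batch, log_block):
    """Return (batch, block_in, block_out) implied by the log2 settings."""
    batch = 1 << log_batch
    block = 1 << log_block
    # Assumption: one LOG_BLOCK value drives both the input and output dimension.
    return batch, block, block


def macs_per_gemm(log_batch, log_block):
    """Multiply-accumulates performed by one GEMM intrinsic invocation."""
    batch, block_in, block_out = gemm_shape(log_batch, log_block)
    return batch * block_in * block_out


if __name__ == "__main__":
    # Example settings only; they are not recommendations.
    for log_batch, log_block in [(0, 4), (1, 4), (0, 5)]:
        b, bi, bo = gemm_shape(log_batch, log_block)
        macs = macs_per_gemm(log_batch, log_block)
        print(f"LOG_BATCH={log_batch} LOG_BLOCK={log_block}: "
              f"({b}x{bi}) * ({bi}x{bo}), {macs} MACs")
```

Under that assumption, raising “LOG_BLOCK” by one quadruples the number of multipliers in the GEMM core, which is why it has such a large effect on resource usage.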

However, when I change “LOG_BATCH” and “LOG_BLOCK”, the tutorial scripts produce wrong results. And when I change the buffer sizes, bitstream generation fails.

Is there a walkthrough guide on how to explore the design space for different resource usage, including how to change the number of units in the tensor ALU? Am I understanding this correctly, or is there more that needs to be done?

Thanks a lot!

Hi, are you still working on this?

I’m going through a similar situation.

Regards,

Jake

I was also working on this.

There are some assertion failures in “test_lib.cc”, lines 970~973, when running the “blocked_gemm_test” method. (This is related to the configured buffer sizes and the bit widths of the VTA hardware, which are defined in hw_spec.h.)

If you set the buffer sizes too small, it causes errors, as I assume you all know.

If you set them too large, I think the BRAMs can’t hold them, or something like that.

I don’t know exactly what’s going on, but at least I found a combination that works.

I set the buffer sizes to 15/15/17/16 (the default is 15/15/18/17). With these values, csim doesn’t fail and the tests run without errors.
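For reference, here is a rough Python sketch comparing the two combinations in bytes. It assumes the four numbers are the log2 sizes, in bytes, of the UOP/INP/WGT/ACC buffers in that order, and that the design uses 8-bit inputs and weights, 32-bit accumulators, LOG_BATCH=0 and LOG_BLOCK=4. Those widths and all helper names are assumptions for illustration, not something confirmed in this thread.

```python
# Assumed element widths in bits (not confirmed in this thread).
INP_BITS, WGT_BITS, ACC_BITS = 8, 8, 32


def tensor_bytes(log_batch=0, log_block=4):
    """Bytes occupied by one tensor element of each buffer (assumed layout)."""
    batch, block = 1 << log_batch, 1 << log_block
    return {
        "INP": batch * block * INP_BITS // 8,
        "WGT": block * block * WGT_BITS // 8,
        "ACC": batch * block * ACC_BITS // 8,
    }


def total_bytes(log_sizes):
    """Sum of all buffer sizes in bytes, given their log2 sizes."""
    return sum(1 << v for v in log_sizes.values())


def depths(log_sizes, log_batch=0, log_block=4):
    """Number of tensor elements each data buffer can hold."""
    elem = tensor_bytes(log_batch, log_block)
    return {name: (1 << log_sizes[name]) // elem[name] for name in elem}


default = {"UOP": 15, "INP": 15, "WGT": 18, "ACC": 17}
reduced = {"UOP": 15, "INP": 15, "WGT": 17, "ACC": 16}

if __name__ == "__main__":
    for label, cfg in [("default", default), ("reduced", reduced)]:
        print(label, total_bytes(cfg), "bytes total, depths:", depths(cfg))
```

Under these assumptions, the reduced setting halves the weight and accumulator buffers, so the total on-chip buffer storage drops from 448 KiB to 256 KiB.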

I’m still trying to understand the situation, so if you figure it out and share what you learn, I’d be very thankful.

best regards, w
