I was wondering how VTA handles the case where a matrix is smaller than the VTA tiling size.
By default, the input tile size is 1 x 16. However, if I want to run a computation on an
input matrix of dimension 5 x 5, how does VTA lower such a computation in Python?
By using padding?
I tried it in vta_get_started.py but I got the following error:
scope local.acc_buffer need to have block=16, shape=[5, 5]
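
To make the padding idea concrete, here is a rough sketch of what I have in mind. It is plain NumPy, not VTA API: `pad_to_block` is a helper name I made up, and the block size of 16 is just taken from the error message above. I don't know whether this is how VTA expects it to be done.

```python
import numpy as np

def pad_to_block(mat, block=16):
    """Zero-pad the last axis of `mat` up to the next multiple of `block`.

    Hypothetical helper for illustration only; not part of VTA or TVM.
    """
    cols = mat.shape[-1]
    padded_cols = -(-cols // block) * block  # ceiling division
    # Pad only the last axis, on the right side.
    pad_width = [(0, 0)] * (mat.ndim - 1) + [(0, padded_cols - cols)]
    return np.pad(mat, pad_width, mode="constant")

a = np.arange(25, dtype=np.int32).reshape(5, 5)
a_padded = pad_to_block(a)              # shape becomes (5, 16)
# Compute on the padded data, then crop back to the original 5 columns.
result = (a_padded + a_padded)[:, :5]
```

The padded columns are zeros, so for an element-wise add they do not affect the cropped result. The padded array would presumably still need to be packed into the tiled layout the tutorial uses before it is handed to VTA.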