Matrix size less than VTA block size


I was wondering how VTA handles the case where the matrix size is smaller than the VTA tiling size.

By default, the input tiling size is 1 x 16. However, if I want to do a computation with an input matrix of dimension 5 x 5, how does VTA lower such a computation in Python? By using padding?

I tried it, but I got the following error:

scope local.acc_buffer need to have block=16, shape=[5, 5]



Yes, right now that's a restriction of VTA. That being said, we may be able to add a Relay pass that reshapes operators so that their shapes are multiples of 16, automatically zero-padding where needed. It might not be the most efficient approach for small shapes, but it would ensure that we provide adequate support for all shapes.
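
Until such a pass exists, the padding can be done by hand before the data reaches VTA. Below is a minimal plain-Python sketch of the idea only; it does not use any TVM or VTA API. The helper names (`round_up`, `zero_pad`, `matmul`, `crop`) are made up for illustration, and the tile shapes (1 x 16 for the input, 16 x 16 for the weights) are an assumption based on the default tiling mentioned in the question. It pads a 5 x 5 problem up to block multiples, multiplies, and crops the result back to 5 x 5. The extra zeros add nothing to any dot product, so the cropped result matches the unpadded one.

```python
def round_up(n, block):
    """Round n up to the nearest multiple of block."""
    return ((n + block - 1) // block) * block


def zero_pad(matrix, block_rows, block_cols):
    """Return a copy of matrix zero-padded so each dimension is a block multiple."""
    rows, cols = len(matrix), len(matrix[0])
    padded_rows = round_up(rows, block_rows)
    padded_cols = round_up(cols, block_cols)
    padded = [[0] * padded_cols for _ in range(padded_rows)]
    for r in range(rows):
        for c in range(cols):
            padded[r][c] = matrix[r][c]
    return padded


def matmul(a, b):
    """Plain matrix multiply on lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def crop(matrix, rows, cols):
    """Keep only the top-left rows x cols region."""
    return [row[:cols] for row in matrix[:rows]]


# Hypothetical 5 x 5 operands, standing in for the matrices from the question.
a = [[r * 5 + c for c in range(5)] for r in range(5)]
b = [[(r + c) % 3 for c in range(5)] for r in range(5)]

# Assumed tile shapes: input 1 x 16, weights 16 x 16.
a_pad = zero_pad(a, 1, 16)    # 5 x 16
b_pad = zero_pad(b, 16, 16)   # 16 x 16

c_pad = matmul(a_pad, b_pad)  # 5 x 16
c = crop(c_pad, 5, 5)         # back to the original 5 x 5 result
```

As noted above, this wastes work for small shapes: here most of each padded tile is zeros.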


Thanks for your answer :slightly_smiling_face: