I’m attempting to run a 32x32 matrix-vector multiply (MVM), written in TensorFlow, on the VTA back-end.
I’m expecting the design to be decomposed into four 16x16 MVM operations, since the default VTA configuration uses a 16x16 GEMM core (and I haven’t changed it).
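To be concrete about what I mean by the decomposition, here is a NumPy sketch (my own illustration, not VTA code) of the blocked computation I expect to fall out of packing a 32x32 MVM onto a 16x16 core:

```python
import numpy as np

# 32x32 weight matrix and 32-element input vector
rng = np.random.default_rng(0)
W = rng.standard_normal((32, 32)).astype("float32")
x = rng.standard_normal(32).astype("float32")

# Full matrix-vector product
y_full = W @ x

# Blocked version: four 16x16 MVM operations, which is what I
# expect vta.graph.pack() to lower the dense op into
B = 16
y_blocked = np.zeros(32, dtype="float32")
for i in range(0, 32, B):
    for j in range(0, 32, B):
        y_blocked[i:i + B] += W[i:i + B, j:j + B] @ x[j:j + B]

assert np.allclose(y_full, y_blocked, atol=1e-5)
```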
However, the graph appears unchanged before (first listing) and after (second listing) the call to vta.graph.pack():
Graph(%W, %x) {
%1 = transpose(%W, axes='(0, 1)')
%2 = reshape(%1, shape='(32, 32)')
%4 = transpose(%x, axes='(0,)')
%5 = reshape(%4, shape='(32, 1)')
%6 = transpose(%5, axes='(1, 0)')
%7 = dense(%2, %6, use_bias='False', units='1')
%8 = reshape(%7, shape='(32,)')
ret %8
}
Graph(%W, %x) {
%1 = transpose(%W, axes='(0, 1)')
%2 = reshape(%1, shape='(32, 32)')
%4 = transpose(%x, axes='(0,)')
%5 = reshape(%4, shape='(32, 1)')
%6 = transpose(%5, axes='(1, 0)')
%7 = dense(%2, %6, use_bias='False', units='1')
%8 = reshape(%7, shape='(32,)')
ret %8
}
graph_attr_keys = [shape_num_unknown_nodes, shape]
I was expecting to see dimensions of size 16 appear in the second graph (i.e., shapes split into 16-element blocks).
Can anyone help me understand what’s going on here?
Thanks!