Can TVM accumulate into a register without storing every time?


#1

In a reduce operation, especially a vectorized one, I want this:

func(a, b):
    vec_reg = (0, 0, 0, 0)
    for i in 0...N
        vec_reg += load(b[i])
    store(a, vec_reg)

However, I can only get:

func(a, b):
    store(a, (0, 0, 0, 0))
    for i in 0...N
        store(a, load(a) + load(b[i]))

which means an unnecessary load and store on every iteration. Is there any way to avoid this? For example, how can I declare a vec_reg?


#2

Most low-level code generators (e.g. LLVM) will promote small constant-size arrays to registers, so we get that for free.
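
As a rough illustration of the pattern, here is a sketch in plain C. It is not TVM's actual generated code; the function name `reduce4` and the layout of `b` (groups of 4 consecutive floats) are made up for this example. The accumulator is a small local array of constant size, which an optimizing compiler can usually keep in registers, so the only store to `a` happens after the loop.

```c
#include <stddef.h>

/* Hypothetical reduction: sums n groups of 4 floats from b into a[0..3]. */
void reduce4(float *a, const float *b, size_t n) {
    /* Small, constant-size local array: a candidate for register promotion. */
    float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};

    for (size_t i = 0; i < n; ++i) {
        for (int lane = 0; lane < 4; ++lane) {
            acc[lane] += b[4 * i + lane];  /* accumulate locally, no access to a */
        }
    }

    /* Write the result to memory once, after the reduction loop. */
    for (int lane = 0; lane < 4; ++lane) {
        a[lane] = acc[lane];
    }
}
```

Whether `acc` really ends up in registers (or in a single vector register) depends on the compiler and optimization level, so it is worth compiling with optimization enabled (for example `-O2`) and checking the generated assembly.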


#3

Thank you. Is this also true of compilers such as gcc and nvcc?


#4

Yes, it should be true for most compilers.