 # How to fuse two compute?

#1

For example, I have two loops:

for (i, 1, n) {
  C[i] = A[i] + B[i]
}

for (i, 1, n) {
  F[i] = D[i] + E[i]
}

How can I get:

for (i, 1, n) {
  C[i] = A[i] + B[i]
  F[i] = D[i] + E[i]
}

#2

Try this

``````
C, F = tvm.compute(shape, lambda i: (A[i]+B[i], D[i] + E[i]), name='C')
``````
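For readers unfamiliar with the tuple form, here is a rough plain-Python analogy of what it expresses (this is not TVM code, and `compute_pair` is a hypothetical helper made up for illustration): one function of `i` returns a pair, and a single loop fills both outputs.

```python
def compute_pair(n, fn):
    """Hypothetical helper: run one loop over i and fill two outputs from fn(i)."""
    first, second = [0] * n, [0] * n
    for i in range(n):
        # Both assignments live in the same loop body, i.e. the fused form.
        first[i], second[i] = fn(i)
    return first, second


# Small made-up inputs, only for illustration.
A = [1, 2, 3, 4]
B = [10, 20, 30, 40]
D = [5, 6, 7, 8]
E = [100, 200, 300, 400]

C, F = compute_pair(len(A), lambda i: (A[i] + B[i], D[i] + E[i]))
print(C)
print(F)
```

The point of the analogy is only that both outputs come from one iteration space; how TVM names and allocates the two outputs is a separate question.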

#3

You might also want to look at this documentation.
It shows an example similar to yours, but using the compute_at primitive.

I am not 100% sure when it's better to do it one way or the other; maybe @lixiaoquan can correct me if I say something wrong.

• I think it depends on whether you are defining a new operation and know that some statements should always be fused; in that case, do it as lixiaoquan suggested.

• Be advised, I am not sure you can split these statements later in TVM.
• I think if you are reusing TOPI operations and want to fuse them, your easiest option would be to use compute_at().

#4

Since C and F are completely independent, I think compute_at won’t work. compute_at can be used when there is a producer-consumer relationship between two compute stages, i.e. when one stage reads the output of the other.
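To make the producer-consumer case concrete, here is a minimal plain-Python sketch (not TVM code; the function names and the `* 2` consumer are assumptions chosen for illustration). The consumer reads the producer's result, so the producer can be moved inside the consumer's loop, which is roughly the situation compute_at is meant for.

```python
def staged(A, B):
    """Producer and consumer in separate loops, with an intermediate buffer T."""
    n = len(A)
    T = [0] * n
    for i in range(n):        # producer loop
        T[i] = A[i] + B[i]
    C = [0] * n
    for i in range(n):        # consumer loop, reads T
        C[i] = T[i] * 2
    return C


def producer_at_consumer(A, B):
    """Producer computed inside the consumer's loop; no full buffer T is needed."""
    n = len(A)
    C = [0] * n
    for i in range(n):
        t = A[i] + B[i]       # producer value for this iteration only
        C[i] = t * 2          # consumer uses it immediately
    return C


print(staged([1, 2, 3], [4, 5, 6]))
print(producer_at_consumer([1, 2, 3], [4, 5, 6]))
```

In the original question neither C nor F reads the other, so there is no consumer loop to attach a producer to.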

#5

Thank you very much, I didn’t know the compute can return a tuple. The docs and tutorials don’t mention it either.
tvm.compute

#6

For some hardware it would be useful.
My first idea was compute_at as well, but it doesn’t work for my real code.

#7

Probably because of what masahi said: there is no producer/consumer relationship between the two.
Thanks for the question, it made me learn something which I thought I had already understood.

#8

Have you solved the problem? I have the same problem as you.

#9

Does TVM have any pass for that? I think this kind of optimization (merging independent loops) might be helpful for VLIW compilers.

#10

Hello @eqy @yzhliu,

I saw you two have been quite active in the past week and thought I would try a shout out and see if you could help us with this matter.

When we call `C_F = tvm.compute((n,), lambda i: (A[i]+B[i], D[i] + E[i]), name='C_F')`, the output is a list of tensors (so `C_F[0]` and `C_F[1]`).
The thing is that the schedules given by

``````
s_cf0 = tvm.create_schedule(C_F[0].op)
s_cf1 = tvm.create_schedule(C_F[1].op)
``````

are identical (except for their address in memory).

Weirdly enough, when we `print(tvm.lower(s_cf0,[A,B,C,D,E,F],simple_mode=True))` we get:

``````
// attr [C_F.v0] storage_scope = "global"
allocate C_F.v0[float32 * 1024]
produce C_F {
  for (i, 0, 1024) {
    C_F.v0[i] = (A[i] + B[i])
    C_F.v0[i] = (D[i] + E[i])
  }
}
// The same output is printed by print(tvm.lower(s_cf1,[A,B,C,D,E,F],simple_mode=True))
``````

So even inside s_cf0’s schedule there seems to be a notion that two variables are being output, but the naming does not match. I would have expected:

``````
produce C_F {
  for (i, 0, 1024) {
    C_F.v0[i] = (A[i] + B[i])
    C_F.v1[i] = (D[i] + E[i])
  }
}
``````