Should TVM share objects among identical Halide Calls?



We have a case like this:

def test_add(need_build=True, need_ir=True):
    #n = tvm.var("n")
    n = 1024
    A = tvm.placeholder((n,), name='A')
    B = tvm.placeholder((n,), name='B')
    C = tvm.placeholder((n,), name='C')
    T = tvm.compute(A.shape, lambda i: A[i] + B[i] + C[i], name="T")
    R = tvm.compute(A.shape, lambda i: T[i] + A[i] + B[i] + C[i], name="R")

    s = tvm.create_schedule([R.op])

    # We only test the Poly pass.
    if need_build:
        with build_config:
            fadd =, [A, B, C, R], "cce", name="myadd")
            #fadd =, [A, B, T], "cce", name="myadd")
        source_code = fadd.imported_modules[0].get_source()
        util.create_cce(kernel_name, "./", source_code)


if __name__ == "__main__":
    test_add()

The output of ScheduleOps looks like this:

// attr [compute(T, 0x2b83d20)] realize_scope = ""
realize T([0, 1024]) {
  produce T {
    for (i, 0, 1024) {
      T(i) =((A(i) + B(i)) + C(i))
    }
  }
  // attr [compute(R, 0x26ae6b0)] realize_scope = ""
  realize R([0, 1024]) {
    produce R {
      for (i, 0, 1024) {
        R(i) =((T(i) + A(i) + B(i)) + C(i))
      }
    }
  }
}
We printed the address of each Halide Call node in SchedulePostProc, and it seems that for the same Halide call, TVM does not share the object. My question is: why don't we share the object? Is it designed this way, or just not considered yet?

We care about this because we are trying to generate cache read/write with Poly; if the objects were shared, this would be easier to do.

call e A(i) addr 0x214fe40
call e B(i) addr 0x214fec0
call e C(i) addr 0x214ff40
call e T(i) addr 0x2153b20
call e A(i) addr 0x2153ba0
call e B(i) addr 0x1c5ea60
call e C(i) addr 0x1c5eae0
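The addresses above show that the two occurrences of A(i) (and B(i), C(i)) are distinct nodes even though they print the same. A toy Python sketch (not TVM internals; the Call class here is hypothetical) of the same situation: constructing the same expression twice gives objects that compare structurally equal but are not the same object.

```python
# Toy AST node, NOT TVM's Call class: structural equality is defined
# via __eq__/__hash__, but each construction is a fresh object.
class Call:
    def __init__(self, func, index):
        self.func = func
        self.index = index

    def __eq__(self, other):
        return (isinstance(other, Call)
                and (self.func, self.index) == (other.func, other.index))

    def __hash__(self):
        return hash((self.func, self.index))

a1 = Call("A", "i")   # A(i) as it appears in T's body
a2 = Call("A", "i")   # A(i) as it appears in R's body

print(a1 == a2)       # True: structurally (alpha-)equivalent
print(a1 is a2)       # False: two objects, two distinct addresses
```

This mirrors the log: equality of the printed form says nothing about pointer identity.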


@tqchen could you take a look at this question? Thanks.


This is not a must, as the code is semantically the same via alpha equivalence. What you are asking for is more like a strictly CSEed form of the AST, which may not be too convenient to construct. We can, however, unify these ASTs during CanonicalSimplify.
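For readers unfamiliar with the "strictly CSEed form" mentioned here: it amounts to interning (hash-consing) nodes, so structurally equal subexpressions resolve to one shared object. A minimal sketch, assuming a toy Call class rather than TVM's actual node system:

```python
# Hash-consing sketch: a table keyed by node contents ensures that
# building "the same" call twice returns one shared object.
class Call:
    _table = {}  # (func, index) -> interned Call node

    def __new__(cls, func, index):
        key = (func, index)
        node = cls._table.get(key)
        if node is None:
            node = super().__new__(cls)
            node.func = func
            node.index = index
            cls._table[key] = node
        return node

a1 = Call("A", "i")
a2 = Call("A", "i")
print(a1 is a2)       # True: both references point at one shared node
print(Call("A", "i") is Call("B", "i"))  # False: different contents
```

The trade-off tqchen points at is that keeping this invariant globally constrains every pass that constructs nodes, which is why unifying during a pass like CanonicalSimplify can be preferable to enforcing sharing at construction time.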


Thanks. We can support this context inside Poly; it's not a big issue now.