Hi,
I have an IR like the one below:
import tvm

ib = tvm.ir_builder.create()
dtype = 'float16'
n = 514
m = 514
_A = tvm.placeholder((n*m,), name='A')
Ab = tvm.decl_buffer((n*m,), dtype, name="A")
A = ib.buffer_ptr(Ab)
_B = tvm.placeholder((n*m,), name='B')
Bb = tvm.decl_buffer((n*m,), dtype, name="B")
B = ib.buffer_ptr(Bb)
# for i in 0 to n-1:
with ib.for_range(0, 11, name="i") as i:
    with ib.for_range(0, 160, name="j") as j:
        # guard depends on both i and j
        with ib.if_scope(((i*160) + j) < 1600):
            A[(i+1)*m+j+1] = B[(i)*m+j+1] + B[(i+1)*m+j+1] + B[(i+2)*m+j+1]
stmt = ib.get()
and I expect it to be optimized into something like this, where the condition no longer depends on j:
with ib.for_range(0, 11, name="i") as i:
    with ib.for_range(0, 160, name="j") as j:
        # <= rather than <, so that i = 9 still passes as it does in the original guard
        with ib.if_scope((i*160) <= 1600 - 160):
            A[(i+1)*m+j+1] = B[(i)*m+j+1] + B[(i+1)*m+j+1] + B[(i+2)*m+j+1]
Is there any code in TVM that can do this? I’ve tried simplify, but it doesn’t seem to be able to.
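
For reference, here is a small plain-Python sanity check (no TVM involved) of why I think the j term can be dropped from the guard. It evaluates the original condition for every (i, j) in the loop nest and shows that, for a fixed i, the result is the same for all j. The helper names (guard, guard_by_row, hoisted_guard) and the constants are my own, for illustration only, and the hoisted form is just one possible j-free condition, not something TVM produces.

```python
# Brute-force check of the guard ((i*160) + j) < 1600 over the loop nest.
J_EXTENT = 160   # extent of the inner loop over j
I_EXTENT = 11    # extent of the outer loop over i
LIMIT = 1600     # bound used in the guard


def guard(i, j):
    """The original condition inside ib.if_scope."""
    return i * J_EXTENT + j < LIMIT


def guard_by_row(i):
    """True/False if the guard is constant over all j for this i, else None."""
    results = {guard(i, j) for j in range(J_EXTENT)}
    return results.pop() if len(results) == 1 else None


def hoisted_guard(i):
    """Hypothetical j-free condition: the largest j (J_EXTENT - 1) must still pass."""
    return i * J_EXTENT <= LIMIT - J_EXTENT


rows = [guard_by_row(i) for i in range(I_EXTENT)]
print(rows)  # one entry per i; None would mean the guard really depends on j

# The hoisted condition agrees with the original one at every (i, j).
agrees = all(
    hoisted_guard(i) == guard(i, j)
    for i in range(I_EXTENT)
    for j in range(J_EXTENT)
)
print(agrees)
```

So the guard is true for i = 0..9 and false for i = 10, independent of j, which is what I would like a simplification pass to figure out.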