# Strange expansion of tensors' bounds

#1

Consider the following code:

``````import tvm

def show_lowered(outputs, inputs):
    sout = tvm.create_schedule([o.op for o in outputs])
    mout = tvm.lower(sout, outputs + inputs, simple_mode=True)
    print(mout)

A = tvm.compute((10, 2), lambda i, j: i + j)
show_lowered([A], [])
B = tvm.compute((10,), lambda i: A[i, i + (0 - i)])
show_lowered([B], [])
``````

For A it produces the following lowered code, as expected:

``````produce compute {
  for (i, 0, 10) {
    for (j, 0, 2) {
      compute[((i*2) + j)] = (i + j)
    }
  }
}
``````

But for B it produces the following code, expanding the size of A:

``````// attr [compute] storage_scope = "global"
allocate compute[int32 * 10 * 19]
produce compute {
  for (i, 0, 10) {
    for (j, 0, 19) {
      if (likely((9 <= j))) {
        if (likely((j < 11))) {
          compute[((i*19) + j)] = ((i + j) + -9)
        }
      }
    }
  }
}
produce compute {
  for (i, 0, 10) {
    compute[i] = compute[((i*19) + 9)]
  }
}
``````

I guess TVM doesn’t simplify `i + (0 - i)`, so it infers the bounds of A’s second dimension too conservatively. My question is: why does it change the size of A at all? I think adding a check or an assertion inside B would be a better solution than expanding A, but maybe I’m missing something?
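For what it’s worth, the `10 * 19` allocation and the `+ -9` offset are exactly what naive interval arithmetic over the index expression would produce. The following is a hypothetical sketch (not TVM’s actual code) of a bound analysis that treats each occurrence of `i` as an independent interval, which is where the precision is lost:

``````python
# Hypothetical interval-arithmetic bound inference, mirroring what the
# observed expansion suggests TVM's bound analysis does. NOT TVM code.
# An interval is a (min, max) pair; operations combine them conservatively.

def iv_add(a, b):
    return (a[0] + b[0], a[1] + b[1])

def iv_sub(a, b):
    # min(a) - max(b), max(a) - min(b): the two occurrences of i are
    # treated as unrelated, so i - i does not collapse to 0.
    return (a[0] - b[1], a[1] - b[0])

i = (0, 9)      # loop variable i ranges over [0, 9]
zero = (0, 0)

# the index expression i + (0 - i) from B's definition
j_index = iv_add(i, iv_sub(zero, i))

print(j_index)  # (-9, 9): extent 19, lowered with a -9 offset
``````

An extent of `9 - (-9) + 1 = 19` matches the `allocate compute[int32 * 10 * 19]` above, and shifting the interval to start at 0 explains the `+ -9` in the store.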

#2

I was able to reproduce this error on recent master (c48812d).

Edit
Here we see that TVM applies some memory optimization algorithm. While I can’t comment on this particular algorithm, I see that it is applied without any scheduling instruction from the DSL. I think this may be a sign of a higher-order design problem. I expected the default code to have a much simpler form, like

``````allocate A[int32 * 10 * 2]
allocate B[int32 * 10]
produce A { ... }
produce B { ... }
``````
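Returning to the simplification angle from #1: if the index expression were canonicalized before bound inference, the interval would collapse to a single point and no expansion would be needed. A minimal sketch of such a rewrite (purely illustrative, using a toy tuple-based AST, not TVM’s expression types):

``````python
# Hypothetical sketch: cancelling x + (0 - x) symbolically before bound
# inference would yield the tight interval [0, 0] (extent 1). NOT TVM code.

def simplify(expr):
    # expr is a nested tuple AST: ('+', a, b), ('-', a, b), or a leaf
    # (variable name or integer constant)
    if isinstance(expr, tuple):
        op, a, b = expr[0], simplify(expr[1]), simplify(expr[2])
        # rewrite rule: x + (0 - x) -> 0
        if op == '+' and b == ('-', 0, a):
            return 0
        return (op, a, b)
    return expr

index = ('+', 'i', ('-', 0, 'i'))   # the index i + (0 - i) from B
print(simplify(index))              # 0: bound [0, 0], A keeps its size
``````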

#3