Loop partition for variables

currently, loop partition has been enhanced and it can support eliminate ‘likely if’ conditions, but it is only for constant. But for variable, the ‘likely if’ still can’t be eliminated completely.

import tvm

def loop_partition():
  n = tvm.var("n")
  A = tvm.placeholder((n, ), name='A')
  B = tvm.placeholder((n, ), name='B')
  T = tvm.compute((n, ), lambda i: A[i]+B[i], name='T')

  s = tvm.create_schedule(T.op)

  xo, xi = s[T].split(T.op.axis[0], 16)

  bounds = tvm.schedule.InferBound(s)
  stmt = tvm.schedule.ScheduleOps(s, bounds)
  print(stmt)
  stmt = tvm.ir_pass.LoopPartition(stmt, True)
  stmt = tvm.ir_pass.Simplify(stmt)
  print(stmt)

loop_partition()

it print like this: in the second loop, it still has ‘if’, but sometimes we want to eliminate this ‘if’ too.

// attr [compute(T, 0x1a154f0)] realize_scope = ""
realize T([0, n]) {
  produce T {
    for (i.outer, 0, (n/16)) {
      for (i.inner, 0, 16) {
        T((i.inner + (i.outer*16))) =(A((i.inner + (i.outer*16))) + B((i.inner + (i.outer*16))))
      }
    }
    for (i.outer, 0, (((n % 16) + 15)/16)) {
      for (i.inner, 0, 16) {
        if ((i.inner < (n - ((i.outer + (n/16))*16)))) {
          T((i.inner + ((i.outer + (n/16))*16))) =(A((i.inner + ((i.outer + (n/16))*16))) + B((i.inner + ((i.outer + (n/16))*16))))
        }
      }
    }
  }
}

@tqchen, do you have plan to support this feature?

i.inner < (n - ((i.outer + (n/16))*16 ) --> i.inner < -i.outer * 16

seems it’s a bug

There are two cases:

  1. if n% 16 == 0, then i.outer’s extent is ((n%16) + 15)/16 = 15/16 = 0. Therefore, the whole i.outer loop will be empty.
  2. if n%16 != 0, then i.outer’s extent will be 1. Therefore, i.outer can only be 0, and the condition will be simplified to if (i.inner < n - n/16*16) which is equivalent to if(i.inner < n % 16).

Considering the above two cases, we can at least simplify the second loopnest to

    if (likely( n%16 != 0)) {
      for (i.inner, 0, 16) {
        if (i.inner < (n%16)) {
          T((i.inner + ((n/16)*16))) =(A(i.inner + (n/16)*16) + B((i.inner + (n/16)*16)))
        }
      }
    }

Now it’s easier to remove the inner if, because the i.inner loop can be simplified to for (i.inner, 0, min(n%16,16)) which is equivalent to for(i.inner, 0, n%16).