Hi all,
In my effort to accelerate AArch64 through tensorization, I ran into an issue.
Basically, I am padding my input tensor so that tensorize
can work (I need the rows to be a multiple of 4 and the cols a multiple of 16).
However, bound inference removes the padding (since it is not used), and when I tile the computation, tir.likely
statements appear. This makes tensorize
fail with:
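To make the requirement concrete, here is a minimal sketch of the padding step in plain NumPy (deliberately TVM-independent; the function name and the multiples 4/16 match my use case, but the helper itself is just illustrative):

```python
import numpy as np

def pad_to_multiple(x, row_mult=4, col_mult=16):
    """Zero-pad a 2D array so rows are a multiple of row_mult
    and cols a multiple of col_mult."""
    rows, cols = x.shape
    padded_rows = -(-rows // row_mult) * row_mult   # ceiling division
    padded_cols = -(-cols // col_mult) * col_mult
    out = np.zeros((padded_rows, padded_cols), dtype=x.dtype)
    out[:rows, :cols] = x  # original data in the top-left corner
    return out

a = np.ones((6, 20), dtype="float32")
b = pad_to_multiple(a)
print(b.shape)  # (8, 32)
```

In TVM terms this would be a te.compute stage guarded by if_then_else; the problem is precisely that bound inference sees the padded region is never consumed and shrinks it away.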
TVMError: Tensorize failed, split condition tir.likely(((...)) relies on var defined inside tensorize scope
One solution is to add a (sufficiently complex) multiplication by zero involving a padding element, to trick bound inference (see for example here).
However, this is very hacky and not meant to last: as bound inference gets smarter, it might detect that the added element is zero.
The question is: should we try to come up with a “good” solution for this?
One idea might be to let tensorize accept @tir.likely
statements and replace them with a “variable size” tensorization provided by the developer.
For instance, we might add an _intrin_func_variable
private function that gets called only when a variable tensorization (i.e., a tensorization over a @tir.likely
region) is needed.
I have also read through this post, but it doesn’t seem to arrive at a concrete solution.
Any ideas?