[SOLVED] Define_split different implementation question


#1

I’ve noticed that with


define_split changes. Previously, the values of all proposed axes multiplied to the desired number, e.g. cfg.define_split("tile_and_bind_split_pad", 256, num_outputs=3)
axis_0 * axis_1 * axis_2 = 256
gave output:
[64, 4, 1]
[4, 8, 8]
[2, 4, 32]
[1, 8, 32]
[2, 1, 128]
[1, 8, 32]
[2, 64, 2]
[32, 1, 8]
[64, 1, 4]
[2, 32, 4]
[2, 4, 32]
[1, 64, 4]
[16, 8, 2]
[8, 4, 8]
[32, 1, 8]
[4, 4, 16]
[8, 8, 4]

Now the same call proposes all divisible factors, and not every candidate multiplies to the given number. Also, the first number in each list from the current define_split is always -1.
e.g.
cfg.define_split("tile_and_bind_split_pad", 256, num_outputs=4)
gives example output:
[-1, 16, 4, 1]
[-1, 4, 1, 4]
[-1, 1, 64, 1]
[-1, 1, 8, 4]
[-1, 1, 64, 4]
[-1, 2, 8, 1]
[-1, 2, 16, 8]
[-1, 8, 16, 1]
[-1, 16, 2, 1]
[-1, 8, 1, 16]
[-1, 8, 1, 16]
[-1, 4, 1, 64]
[-1, 2, 1, 64]
[-1, 2, 1, 4]
[-1, 1, 8, 1]

@comaniac, why is -1 added at the beginning of the list?
With the current implementation, how can I check only the candidates that multiply to the given number?

Thanks in advance.


#2

-1 is just a more general representation. In AutoTVM, you can set only one split factor to -1, and it stands for whatever factor remains after dividing the axis length by the other factors. For example, [-1, 16, 4, 1] is equivalent to [4, 16, 4, 1] when the length is 256.
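
As a rough illustration of that convention in plain Python (not AutoTVM's own code), the sketch below fills in the single -1 with the axis length divided by the product of the other factors. The helper name resolve_split is hypothetical, and it assumes the explicit factors divide the length evenly.

```python
from math import prod


def resolve_split(factors, length):
    """Replace the single -1 in `factors` with the remaining factor.

    Hypothetical helper for illustration only; it assumes the explicit
    factors divide `length` evenly and raises otherwise.
    """
    if factors.count(-1) != 1:
        raise ValueError("exactly one factor must be -1")
    # Product of every explicitly given factor.
    rest = prod(f for f in factors if f != -1)
    if length % rest != 0:
        raise ValueError("explicit factors do not divide the axis length")
    return [length // rest if f == -1 else f for f in factors]


print(resolve_split([-1, 16, 4, 1], 256))  # [4, 16, 4, 1]
```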

The PR I filed only adds a new optional policy that adds power-of-two numbers to the candidate list, so the original policy, which uses divisible numbers, is still the default setting. This means you can still use define_split as before and get the same result.

Here is an example:

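from tvm.autotvm.task import space

# Minimal stand-in for a TVM axis that exposes only a `length` attribute.
class HackAxis:
    def __init__(self, length):
        self.length = length

axis = [HackAxis(256)]

# 'factors' is the original (default) policy that uses divisible numbers.
s = space.SplitSpace(axis, 'factors', num_outputs=3)
print(len(s.entities))  # >>> 45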

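When the number of outputs is 3, the first number is always computed as 256 / axis_1 / axis_2, so only axis_1 and axis_2 are enumerated. When axis_2 = 1, axis_1 can be any of 1, 2, 4, …, 256 (9 choices); when axis_2 = 2, axis_1 can be any of 1, 2, 4, …, 128 (8 choices); and so on, down to axis_2 = 256, where axis_1 can only be 1 (1 choice). As a result, the total is 9 + 8 + 7 + … + 1 = 45.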
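
The count can be double-checked without TVM. The following is a minimal sketch in plain Python that enumerates the same choices; the helper names divisors and count_splits are made up for this example and are not part of AutoTVM.

```python
def divisors(n):
    """Return all positive divisors of n in ascending order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def count_splits(length):
    """Count (axis_1, axis_2) pairs whose product divides `length`.

    The first factor is implied as length / axis_1 / axis_2, so it is
    not enumerated separately.
    """
    total = 0
    for axis_2 in divisors(length):
        # axis_1 must divide whatever is left after taking out axis_2.
        total += len(divisors(length // axis_2))
    return total


print(count_splits(256))  # 45
```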


#3

Thank you for the answer.
If -1 means the remaining divisible factor, then what does -1 mean when no_tail is False and I get, e.g., [-1, 16, 256, 2] (for an axis of length 256)?


#4

Hmm, I think [-1, 16, 256, 2] for an axis of length 256 will make that -1 just 1, but this looks more like a bug because it could cause issues for codegen. I previously assumed no_tail will be True only when policy="power2" is used. I had better refine the logic for it. Thanks for pointing this out!
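
One way to check by hand which candidates multiply exactly to the axis length is to test whether the explicit factors divide it. This is only a sketch on plain Python lists, not on AutoTVM entities, and the helper name multiplies_exactly is hypothetical.

```python
from math import prod


def multiplies_exactly(factors, length):
    """True if the explicit factors (everything except -1) divide `length`.

    In that case the -1 slot can be filled so that the full product
    equals `length`, i.e. the split has no tail.
    """
    rest = prod(f for f in factors if f != -1)
    return length % rest == 0


candidates = [[-1, 16, 4, 1], [-1, 16, 256, 2], [-1, 2, 16, 8]]
exact = [c for c in candidates if multiplies_exactly(c, 256)]
print(exact)  # [[-1, 16, 4, 1], [-1, 2, 16, 8]]
```

Here [-1, 16, 256, 2] is dropped because 16 * 256 * 2 = 8192 does not divide 256.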


#5

PR Merged