Relay.analysis.alpha_equal: Questions and Potential Issues

I have been working on some tests and have found a couple of potential issues with relay.analysis.alpha_equal. Here are two examples to illustrate what I’m seeing:

  1. This first test creates two functions. They are computationally equivalent, but add_1_fn has an attribute set. When I run alpha_equal, I expect the result to be False. However, the result is different depending on the order of the inputs. When add_1_fn is passed as LHS and add_fn is passed as RHS, the result is True.
from tvm import relay
from tvm.relay.testing import run_opt_pass
import tvm.tir

def test_fn_attribute():
    # create function that performs add
    a = relay.var('a', shape=(10, 10))
    b = relay.var('b', shape=(10, 10))
    add = relay.add(a, b)
    add_fn = relay.Function([a, b], add)
    add_fn = run_opt_pass(add_fn, relay.transform.InferType())

    # create function that performs add with test attribute
    c = relay.var('c', shape=(10, 10))
    d = relay.var('d', shape=(10, 10))
    add_1 = relay.add(c, d)
    add_1_fn = relay.Function([c, d], add_1)
    add_1_fn = add_1_fn.set_attribute("TestAttribute", tvm.tir.StringImm("test"))
    add_1_fn = run_opt_pass(add_1_fn, relay.transform.InferType())

    print(relay.analysis.alpha_equal(add_fn, add_1_fn)) # should be false
    print(relay.analysis.alpha_equal(add_1_fn, add_fn)) # should be false

test_fn_attribute()

Output:

False
True

Relay graphs:

# add_fn
fn (%a: Tensor[(10, 10), float32], %b: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
  add(%a, %b) /* ty=Tensor[(10, 10), float32] */
}

# add_1_fn
fn (%c: Tensor[(10, 10), float32], %d: Tensor[(10, 10), float32], TestAttribute="test") -> Tensor[(10, 10), float32] {
  add(%c, %d) /* ty=Tensor[(10, 10), float32] */
}
  2. The second example creates two graphs that are again computationally equivalent, but have different dataflow. There are two key differences in the graphs:

    1. The second function, add_6_fn, contains a redundant add operation. This redundant operation is used in computing the final output.
    2. The final computations in each function, add_2 and add_6, are defined with their operands in flipped order.

    When I run alpha_equal, I expect the result to be False. However, the result is again different depending on the order of the inputs. When add_6_fn is passed as LHS and add_2_fn is passed as RHS, the result is True.

from tvm import relay
from tvm.relay.testing import run_opt_pass
import tvm.tir

def test_dataflow():
    # create function that performs three adds
    a = relay.var('a', shape=(10, 10))
    b = relay.var('b', shape=(10, 10))
    add = relay.add(a, b)
    add_1 = relay.add(a, add)
    add_2 = relay.add(add_1, add)
    add_2_fn = relay.Function([a, b], add_2)
    add_2_fn = run_opt_pass(add_2_fn, relay.transform.InferType())

    # create function that performs four adds but is computationally equivalent
    c = relay.var('c', shape=(10, 10))
    d = relay.var('d', shape=(10, 10))
    add_3 = relay.add(c, d)
    add_4 = relay.add(c, add_3)
    add_5 = relay.add(c, d)
    add_6 = relay.add(add_4, add_5)
    add_6_fn = relay.Function([c, d], add_6)
    add_6_fn = run_opt_pass(add_6_fn, relay.transform.InferType())

    print(relay.analysis.alpha_equal(add_2_fn, add_6_fn)) # should be false
    print(relay.analysis.alpha_equal(add_6_fn, add_2_fn)) # should be false

test_dataflow()

Output:

False
True

Relay graphs:

# add_2_fn
fn (%a: Tensor[(10, 10), float32], %b: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
  %0 = add(%a, %b) /* ty=Tensor[(10, 10), float32] */;
  %1 = add(%a, %0) /* ty=Tensor[(10, 10), float32] */;
  add(%1, %0) /* ty=Tensor[(10, 10), float32] */
}

# add_6_fn
fn (%c: Tensor[(10, 10), float32], %d: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
  %0 = add(%c, %d) /* ty=Tensor[(10, 10), float32] */;
  %1 = add(%c, %0) /* ty=Tensor[(10, 10), float32] */;
  %2 = add(%c, %d) /* ty=Tensor[(10, 10), float32] */;
  add(%1, %2) /* ty=Tensor[(10, 10), float32] */
}

Interestingly, when add_2 is defined as relay.add(add, add_1), both outputs are correctly False.

I have a couple of questions based on this:

  1. Is this expected behavior?
  2. What is the correct ordering of inputs for alpha_equal? Should expected be LHS and actual be RHS? From my testing, it seems that this is correct.

This PR has some initial discussion on the issue.

Thanks!

cc @zhiics @tqchen

@jonso Thanks for reporting this. I found the bug for the first case. Will spend some time on the second one.

For the first case:

For the second case, I think the problem is that we don’t have a 1:1 mapping between lhs and rhs expressions. The case that passes does so because we allow a many:1 mapping, i.e. both add(c, d) nodes are mapped to the single add(a, b). However, we cannot have a 1:many mapping when we flip the args.
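
To make the asymmetry concrete, here is a simplified Python model of the memoization the comparison relies on (hypothetical helper names; the real logic lives in the C++ alpha-equal visitor):

equal_map = {}  # records lhs node -> the rhs node it was matched against

def memo_check(lhs, rhs):
    if lhs in equal_map:
        # lhs was matched before: it must have been matched against this same rhs
        return equal_map[lhs] is rhs
    # first time we see lhs: record the pairing
    # (the structural comparison of the two nodes is omitted here)
    equal_map[lhs] = rhs
    return True

With only this direction recorded, both add(c, d) nodes on the lhs can pair with the single add(a, b) on the rhs (many:1), but with the arguments flipped the single add node would need two different partners and the check fails.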

I have a local fix, but some other tests failed. I will need to spend a bit more time looking into it when I am available.

@zhiics, thanks a lot for the quick fix!

I found another issue; I am not sure if it is related to the one you described:

I create two graphs that both do two adds. The first graph creates a function to do the add, and uses it twice. The second graph creates two functions to do the adds, and calls them sequentially. alpha_equal returns True when the one with two functions is LHS and the one with one function is RHS.

from tvm import relay
from tvm.relay.testing import run_opt_pass
import tvm.tir

def test_reuse_function():
    # create function that contains another function and re-uses it
    a = relay.var('a', shape=(10, 10))
    b = relay.var('b', shape=(10, 10))

    in_1 = relay.var('in_1', shape=(10, 10))
    in_2 = relay.var('in_2', shape=(10, 10))
    add = relay.add(in_1, in_2)
    add_fn = relay.Function([in_1, in_2], add)

    add_call = relay.Call(add_fn, [a, b])
    add_call_1 = relay.Call(add_fn, [add_call, a])

    add_call_1_fn = relay.Function([a, b], add_call_1)
    add_call_1_fn = run_opt_pass(add_call_1_fn, relay.transform.InferType())

    # create function that contains two functions and is computationally equivalent
    d = relay.var('d', shape=(10, 10))
    e = relay.var('e', shape=(10, 10))

    in_3 = relay.var('in_3', shape=(10, 10))
    in_4 = relay.var('in_4', shape=(10, 10))
    add_3 = relay.add(in_3, in_4)
    add_3_fn = relay.Function([in_3, in_4], add_3)

    in_5 = relay.var('in_5', shape=(10, 10))
    in_6 = relay.var('in_6', shape=(10, 10))
    add_4 = relay.add(in_5, in_6)
    add_4_fn = relay.Function([in_5, in_6], add_4)

    add_call_2 = relay.Call(add_3_fn, [d, e])
    add_call_3 = relay.Call(add_4_fn, [add_call_2, d])

    add_call_3_fn = relay.Function([d, e], add_call_3)
    add_call_3_fn = run_opt_pass(add_call_3_fn, relay.transform.InferType())

    print(relay.analysis.alpha_equal(add_call_1_fn, add_call_3_fn)) # should be false
    print(relay.analysis.alpha_equal(add_call_3_fn, add_call_1_fn)) # should be false

test_reuse_function()

Output:

False
True

Relay graphs:

# add_call_1_fn
fn (%a: Tensor[(10, 10), float32], %b: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
  %0 = fn (%in_1: Tensor[(10, 10), float32], %in_2: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
    add(%in_1, %in_2) /* ty=Tensor[(10, 10), float32] */
  };
  %1 = %0(%a, %b) /* ty=Tensor[(10, 10), float32] */;
  %0(%1, %a) /* ty=Tensor[(10, 10), float32] */
}

# add_call_3_fn
fn (%d: Tensor[(10, 10), float32], %e: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
  %0 = fn (%in_3: Tensor[(10, 10), float32], %in_4: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
    add(%in_3, %in_4) /* ty=Tensor[(10, 10), float32] */
  };
  %1 = %0(%d, %e) /* ty=Tensor[(10, 10), float32] */;
  %2 = fn (%in_5: Tensor[(10, 10), float32], %in_6: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
    add(%in_5, %in_6) /* ty=Tensor[(10, 10), float32] */
  };
  %2(%1, %d) /* ty=Tensor[(10, 10), float32] */
}

@jonso yeah, it’s the same thing. You can have %2 and %0 in add_call_3_fn map to %0 in add_call_1_fn, but not vice versa.

@MarisaKirisame @jroesch @tqchen Can you guys take a look as well because alpha_equal is not commutative currently?

Should we map rhs to lhs here and check it similarly to lhs above? This restricts the mapping between lhs and rhs to 1:1.

Or should we also check the equivalence of the expression here:

for example, return it->second.same_as(rhs) || Equal(it->second, rhs); This relaxes the data-flow comparison.
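
As a sketch of that relaxed variant, in the same simplified Python model as above (not the actual C++ code):

equal_map = {}  # lhs node -> rhs node seen so far

def memo_check_relaxed(lhs, rhs, struct_equal):
    # struct_equal stands in for a recursive structural comparison
    if lhs in equal_map:
        # accept the exact node recorded earlier, or anything structurally
        # equal to it (mirrors it->second.same_as(rhs) || Equal(it->second, rhs))
        return equal_map[lhs] is rhs or struct_equal(equal_map[lhs], rhs)
    equal_map[lhs] = rhs
    return True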

Each of them breaks a number of tests in the code base. Or do you have any other fixes in mind? Thanks.

There is a limit to what we can do in alpha_equal. In the case of two functions vs. one function with the same value, I feel it is fine to treat them differently.

Yeah, for functions, I agree that it’s okay to restrict the mapping. But the same situation happens with all other expressions as well. For example:

from tvm import relay
from tvm.relay.analysis import alpha_equal

x = relay.var("x")

y0 = relay.add(x, x)
z0 = relay.add(y0, y0)

y1 = relay.add(x, x)
z1 = relay.add(y1, y1)
z3 = relay.add(relay.add(x, x), relay.add(x, x))

print(alpha_equal(z3, z0))  # prints True
print(alpha_equal(z0, z3))  # prints False

It really boils down to the semantics of the temporary graph nodes, discussed in https://docs.tvm.ai/dev/relay_intro.html

In the case of z3: z3 has two parallel add expressions that do not reuse the result of add(x, x), while z0 does reuse it, so they should not be equal to each other, because we use dataflow semantics.

I see, thanks. So we don’t expect commutativity for alpha_equal.

In the above case, it would be great if we could return False for both checks (though it can also be fine as it is). Do you know why it returns a different result depending on the argument order?

Yes, I briefly talked about the reason above; it is caused by how we store info in the map here:

For example, if lhs is:

%1 = add(x, y)
%2 = add(x, y)

and rhs is:

%1' = add(x, y)

We can map both %1 and %2 on the lhs to %1' on the rhs, but this will fail if we flip lhs and rhs, because we cannot map %1' to both %1 and %2.

I see, perhaps we need to put more thought into the problem. One idea is to build equal_map in both directions (lhs->rhs and rhs->lhs) and only allow each mapping to happen once.
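
A minimal sketch of that idea, continuing the simplified Python model used above (hypothetical names, not the actual implementation):

equal_map_fwd = {}  # lhs node -> rhs node
equal_map_rev = {}  # rhs node -> lhs node

def memo_check_one_to_one(lhs, rhs):
    if lhs in equal_map_fwd:
        return equal_map_fwd[lhs] is rhs
    if rhs in equal_map_rev:
        # rhs is already paired with some other lhs node
        return equal_map_rev[rhs] is lhs
    equal_map_fwd[lhs] = rhs
    equal_map_rev[rhs] = lhs
    return True

This makes the result symmetric, but as the discussion below shows, it also rejects graph-mode cases where several textually identical nodes legitimately map onto one shared node.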

Yeah, that’s something I mentioned above: basically we need to have a 1:1 mapping, but some tests fail. I didn’t spend much time looking into them.

This is definitely an error - it must be commutative. @zhiics is right - we need the map to be a multimap instead of one where each key can appear only once.

It looks like the 1:1 mapping is too restrictive.

In test_ir_parser.py, test_seq and test_incomplete_type are failing because:

lhs = # let %v_ = (); ()
rhs = # %0 = (); let %v_ = %0; %0

graph_equal(lhs, rhs) # False

The two uses of %0 in the rhs are ==, but %0 can only map to one of the () nodes in the lhs.
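
For reference, the two shapes can be reproduced by hand (a rough reconstruction of the expressions above, not the exact parser test):

from tvm import relay

# lhs: let %v_ = (); ()  -- the two () are distinct Tuple nodes
lhs = relay.Let(relay.var("v_"), relay.Tuple([]), relay.Tuple([]))

# rhs: %0 = (); let %v_ = %0; %0  -- a single Tuple node reused twice
unit = relay.Tuple([])
rhs = relay.Let(relay.var("v_"), unit, unit)

# Under graph-mode semantics these should compare equal, but a strict
# 1:1 mapping forbids both () nodes in lhs from pairing with the single
# %0 in rhs, so the check starts returning False.
print(relay.analysis.graph_equal(lhs, rhs))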

@hypercubestart yes, it will not allow dataflow analysis. LHS->RHS is a many:1 mapping; we somehow need 1:many for RHS->LHS.

Should equal_map_ be a multimap that maps both lhs -> rhs and rhs -> lhs? In that case, wouldn’t we be expanding equality to potentially structurally inequivalent expressions?

I’m not sure if this relates to the exact same problem, but I’m seeing strange behaviour from alpha_equal:

from tvm import relay

def test_trivial():
    def fan_out():
        x = relay.var("x", shape=(10, 10))
        a_1 = relay.abs(x)
        a_2 = relay.abs(x)
        out = relay.add(a_1, a_2)
        f = relay.Function([x], out)
        return f

    def normal():
        x = relay.var("x", shape=(10, 10))
        a = relay.abs(x)
        out = relay.add(a, a)
        f = relay.Function([x], out)
        return f

    print(fan_out())
    print(normal())
    print(relay.analysis.alpha_equal(fan_out(), normal()))

test_trivial()

Which gives as output:

fn (%x: Tensor[(10, 10), float32]) {
  %0 = abs(%x);
  %1 = abs(%x);
  add(%0, %1)
}
fn (%x: Tensor[(10, 10), float32]) {
  %0 = abs(%x);
  add(%0, %0)
}
True

I’d expect the equality check to fail here, but it succeeds. I see this whenever there’s a pattern of multiple identical branches fanning out and then recombining. It can be a problem when testing graph annotation, as non-identical graphs pass the tests.

Is this expected behaviour or a bug? @zhiics @tqchen

@mbaret Yes, this is the same problem.