Relay.analysis.alpha_equal: Questions and Potential Issues

I have been working on some tests and have found a couple of potential issues with relay.analysis.alpha_equal. Here are two examples to illustrate what I’m seeing:

  1. This first test creates two functions. They are computationally equivalent, but add_1_fn has an attribute set. When I run alpha_equal, I expect the result to be False. However, the result is different depending on the order of the inputs. When add_1_fn is passed as LHS and add_fn is passed as RHS, the result is True.
from tvm import relay
from tvm.relay.testing import run_opt_pass
import tvm.tir

def test_fn_attribute():
    # create function that performs add
    a = relay.var('a', shape=(10, 10))
    b = relay.var('b', shape=(10, 10))
    add = relay.add(a, b)
    add_fn = relay.Function([a, b], add)
    add_fn = run_opt_pass(add_fn, relay.transform.InferType())

    # create function that performs add with test attribute
    c = relay.var('c', shape=(10, 10))
    d = relay.var('d', shape=(10, 10))
    add_1 = relay.add(c, d)
    add_1_fn = relay.Function([c, d], add_1)
    add_1_fn = add_1_fn.set_attribute("TestAttribute", tvm.tir.StringImm("test"))
    add_1_fn = run_opt_pass(add_1_fn, relay.transform.InferType())

    print(relay.analysis.alpha_equal(add_fn, add_1_fn)) # should be false
    print(relay.analysis.alpha_equal(add_1_fn, add_fn)) # should be false

test_fn_attribute()

Output:

False
True

Relay graphs:

# add_fn
fn (%a: Tensor[(10, 10), float32], %b: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
  add(%a, %b) /* ty=Tensor[(10, 10), float32] */
}

# add_1_fn
fn (%c: Tensor[(10, 10), float32], %d: Tensor[(10, 10), float32], TestAttribute="test") -> Tensor[(10, 10), float32] {
  add(%c, %d) /* ty=Tensor[(10, 10), float32] */
}
  2. The second example creates two graphs that are again computationally equivalent, but have different dataflow. There are two key differences in the graphs:

    1. The second function, add_6_fn, contains a redundant add operation. This redundant operation is used in computing the final output.
    2. The final computations in each function, add_2 and add_6, are defined with their operands in flipped order.

    When I run alpha_equal, I expect the result to be False. However, the result is again different depending on the order of the inputs. When add_6_fn is passed as LHS and add_2_fn is passed as RHS, the result is True.

from tvm import relay
from tvm.relay.testing import run_opt_pass
import tvm.tir

def test_dataflow():
    # create function that performs three adds
    a = relay.var('a', shape=(10, 10))
    b = relay.var('b', shape=(10, 10))
    add = relay.add(a, b)
    add_1 = relay.add(a, add)
    add_2 = relay.add(add_1, add)
    add_2_fn = relay.Function([a, b], add_2)
    add_2_fn = run_opt_pass(add_2_fn, relay.transform.InferType())

    # create function that performs four adds but is computationally equivalent
    c = relay.var('c', shape=(10, 10))
    d = relay.var('d', shape=(10, 10))
    add_3 = relay.add(c, d)
    add_4 = relay.add(c, add_3)
    add_5 = relay.add(c, d)
    add_6 = relay.add(add_4, add_5)
    add_6_fn = relay.Function([c, d], add_6)
    add_6_fn = run_opt_pass(add_6_fn, relay.transform.InferType())

    print(relay.analysis.alpha_equal(add_2_fn, add_6_fn)) # should be false
    print(relay.analysis.alpha_equal(add_6_fn, add_2_fn)) # should be false

test_dataflow()

Output:

False
True

Relay graphs:

# add_2_fn
fn (%a: Tensor[(10, 10), float32], %b: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
  %0 = add(%a, %b) /* ty=Tensor[(10, 10), float32] */;
  %1 = add(%a, %0) /* ty=Tensor[(10, 10), float32] */;
  add(%1, %0) /* ty=Tensor[(10, 10), float32] */
}

# add_6_fn
fn (%c: Tensor[(10, 10), float32], %d: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
  %0 = add(%c, %d) /* ty=Tensor[(10, 10), float32] */;
  %1 = add(%c, %0) /* ty=Tensor[(10, 10), float32] */;
  %2 = add(%c, %d) /* ty=Tensor[(10, 10), float32] */;
  add(%1, %2) /* ty=Tensor[(10, 10), float32] */
}

Interestingly, when add_2 is defined as relay.add(add, add_1), both outputs are correctly False.

I have a couple of questions based on this:

  1. Is this expected behavior?
  2. What is the correct ordering of inputs for alpha_equal? Should expected be LHS and actual be RHS? From my testing, it seems that this is correct.

This PR has some initial discussion on the issue.

Thanks!

cc @zhiics @tqchen

@jonso Thanks for reporting this. I found the bug for the first case. Will spend some time on the second one.

For the first case:

For the second case, I think the problem is that we don’t have a 1:1 mapping between lhs and rhs expressions. The case that passes does so because we allow a many:1 mapping, i.e. both add(c, d) nodes are mapped to the single add(a, b). However, we cannot have a 1:many mapping when we flip the args.
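
To make the asymmetry concrete, here is a simplified Python model of the memoization the comparison relies on (hypothetical helper names; the real logic lives in the C++ alpha-equal visitor):

equal_map = {}  # records lhs node -> the rhs node it was matched against

def memo_check(lhs, rhs):
    if lhs in equal_map:
        # lhs was matched before: it must have been matched against this same rhs
        return equal_map[lhs] is rhs
    # first time we see lhs: record the pairing
    # (the structural comparison of the two nodes is omitted here)
    equal_map[lhs] = rhs
    return True

With only this direction recorded, both add(c, d) nodes on the lhs can pair with the single add(a, b) on the rhs (many:1), but with the arguments flipped the single add node would need two different partners and the check fails.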

I have a local fix, but some other tests failed. I will need to spend a bit more time looking into it when I am available.

@zhiics, thanks a lot for the quick fix!

I found another issue; I am not sure if it is related to the one you described:

I create two graphs that both do two adds. The first graph creates a function to do the add, and uses it twice. The second graph creates two functions to do the adds, and calls them sequentially. alpha_equal returns True when the one with two functions is LHS and the one with one function is RHS.

from tvm import relay
from tvm.relay.testing import run_opt_pass
import tvm.tir

def test_reuse_function():
    # create function that contains another function and re-uses it
    a = relay.var('a', shape=(10, 10))
    b = relay.var('b', shape=(10, 10))

    in_1 = relay.var('in_1', shape=(10, 10))
    in_2 = relay.var('in_2', shape=(10, 10))
    add = relay.add(in_1, in_2)
    add_fn = relay.Function([in_1, in_2], add)

    add_call = relay.Call(add_fn, [a, b])
    add_call_1 = relay.Call(add_fn, [add_call, a])

    add_call_1_fn = relay.Function([a, b], add_call_1)
    add_call_1_fn = run_opt_pass(add_call_1_fn, relay.transform.InferType())

    # create function that contains two functions and is computationally equivalent
    d = relay.var('d', shape=(10, 10))
    e = relay.var('e', shape=(10, 10))

    in_3 = relay.var('in_3', shape=(10, 10))
    in_4 = relay.var('in_4', shape=(10, 10))
    add_3 = relay.add(in_3, in_4)
    add_3_fn = relay.Function([in_3, in_4], add_3)

    in_5 = relay.var('in_5', shape=(10, 10))
    in_6 = relay.var('in_6', shape=(10, 10))
    add_4 = relay.add(in_5, in_6)
    add_4_fn = relay.Function([in_5, in_6], add_4)

    add_call_2 = relay.Call(add_3_fn, [d, e])
    add_call_3 = relay.Call(add_4_fn, [add_call_2, d])

    add_call_3_fn = relay.Function([d, e], add_call_3)
    add_call_3_fn = run_opt_pass(add_call_3_fn, relay.transform.InferType())

    print(relay.analysis.alpha_equal(add_call_1_fn, add_call_3_fn)) # should be false
    print(relay.analysis.alpha_equal(add_call_3_fn, add_call_1_fn)) # should be false

test_reuse_function()

Output:

False
True

Relay graphs:

# add_call_1_fn
fn (%a: Tensor[(10, 10), float32], %b: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
  %0 = fn (%in_1: Tensor[(10, 10), float32], %in_2: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
    add(%in_1, %in_2) /* ty=Tensor[(10, 10), float32] */
  };
  %1 = %0(%a, %b) /* ty=Tensor[(10, 10), float32] */;
  %0(%1, %a) /* ty=Tensor[(10, 10), float32] */
}

# add_call_3_fn
fn (%d: Tensor[(10, 10), float32], %e: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
  %0 = fn (%in_3: Tensor[(10, 10), float32], %in_4: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
    add(%in_3, %in_4) /* ty=Tensor[(10, 10), float32] */
  };
  %1 = %0(%d, %e) /* ty=Tensor[(10, 10), float32] */;
  %2 = fn (%in_5: Tensor[(10, 10), float32], %in_6: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
    add(%in_5, %in_6) /* ty=Tensor[(10, 10), float32] */
  };
  %2(%1, %d) /* ty=Tensor[(10, 10), float32] */
}

@jonso yeah, it’s the same thing. You can have %2 and %0 in add_call_3_fn map to %0 in add_call_1_fn, but not vice versa.

@MarisaKirisame @jroesch @tqchen Can you guys take a look as well because alpha_equal is not commutative currently?

Should we map rhs to lhs here and check it similarly to lhs above? This restricts the mapping between lhs and rhs to 1:1.

Or should we also check the equivalence of the expression here:

for example, return it->second.same_as(rhs) || Equal(it->second, rhs); This relaxes the data-flow comparison.
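
As a sketch of that relaxed variant, in the same simplified Python model as above (not the actual C++ code):

equal_map = {}  # lhs node -> rhs node seen so far

def memo_check_relaxed(lhs, rhs, struct_equal):
    # struct_equal stands in for a recursive structural comparison
    if lhs in equal_map:
        # accept the exact node recorded earlier, or anything structurally
        # equal to it (mirrors it->second.same_as(rhs) || Equal(it->second, rhs))
        return equal_map[lhs] is rhs or struct_equal(equal_map[lhs], rhs)
    equal_map[lhs] = rhs
    return True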

Each of them breaks a number of tests in the code base. Or do you have any other fixes in mind? Thanks.

There is a limit to what we can do in alpha_equal. In the case of two functions vs. one function with the same value, I feel it is fine to treat them differently.

Yeah, for functions, I agree that it’s okay to restrict the mapping. But the same situation happens with all other expressions as well. For example:

from tvm import relay
from tvm.relay.analysis import alpha_equal

x = relay.var("x")

y0 = relay.add(x, x)
z0 = relay.add(y0, y0)

y1 = relay.add(x, x)
z1 = relay.add(y1, y1)
z3 = relay.add(relay.add(x, x), relay.add(x, x))

print(alpha_equal(z3, z0))  # prints True
print(alpha_equal(z0, z3))  # prints False

It really boils down to the semantics of the temporary graph nodes, discussed in https://docs.tvm.ai/dev/relay_intro.html

In the case of z3: z3 has two parallel add expressions that do not reuse the result of add(x, x), while z0 does reuse it, so they should not be equal to each other, because we use dataflow semantics.

I see, thanks. So we don’t expect commutativity for alpha_equal.

In the above case, it would be great if we could return False for both checks (though it can also be fine as it is). Do you know why it returns a different result depending on the argument order?

Yes, I briefly talked about the reason above; it is caused by how we store info in the map here:

For example, if lhs is:

%1 = add(x, y)
%2 = add(x, y)

and rhs is:

%1' = add(x, y)

We can map both %1 and %2 on the lhs to %1' on the rhs, but this will fail if we flip lhs and rhs, because we cannot map %1' to both %1 and %2.

I see, perhaps we need to put more thought into the problem. One idea is to build equal_map in both directions (lhs->rhs and rhs->lhs) and only allow each mapping to happen once.
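
A minimal sketch of that idea, continuing the simplified Python model used above (hypothetical names, not the actual implementation):

equal_map_fwd = {}  # lhs node -> rhs node
equal_map_rev = {}  # rhs node -> lhs node

def memo_check_one_to_one(lhs, rhs):
    if lhs in equal_map_fwd:
        return equal_map_fwd[lhs] is rhs
    if rhs in equal_map_rev:
        # rhs is already paired with some other lhs node
        return equal_map_rev[rhs] is lhs
    equal_map_fwd[lhs] = rhs
    equal_map_rev[rhs] = lhs
    return True

This makes the result symmetric, but as the discussion below shows, it also rejects graph-mode cases where several textually identical nodes legitimately map onto one shared node.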

Yeah, that’s something I mentioned above: basically we need to have a 1:1 mapping, but some tests fail. I didn’t spend much time looking into them.

This is definitely an error - it must be commutative. @zhiics is right - we need the map to be a multimap instead of one where each key can appear only once.

It looks like the 1:1 mapping is too restrictive.

In test_ir_parser.py, test_seq and test_incomplete_type are failing because:

lhs = # let %v_ = (); ()
rhs = # %0 = (); let %v_ = %0; %0

graph_equal(lhs, rhs) # False

The two uses of %0 in the rhs are ==, but %0 can only map to one of the () nodes in the lhs.
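
For reference, the two shapes can be reproduced by hand (a rough reconstruction of the expressions above, not the exact parser test):

from tvm import relay

# lhs: let %v_ = (); ()  -- the two () are distinct Tuple nodes
lhs = relay.Let(relay.var("v_"), relay.Tuple([]), relay.Tuple([]))

# rhs: %0 = (); let %v_ = %0; %0  -- a single Tuple node reused twice
unit = relay.Tuple([])
rhs = relay.Let(relay.var("v_"), unit, unit)

# Under graph-mode semantics these should compare equal, but a strict
# 1:1 mapping forbids both () nodes in lhs from pairing with the single
# %0 in rhs, so the check starts returning False.
print(relay.analysis.graph_equal(lhs, rhs))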

@hypercubestart yes, it will not allow dataflow analysis. LHS->RHS is a many:1 mapping; we somehow need 1:many for RHS->LHS.

Should equal_map_ be a multimap that maps both lhs -> rhs and rhs -> lhs? In that case, wouldn’t we be expanding equality to potentially structurally inequivalent expressions?

I’m not sure if this relates to the exact same problem, but I’m seeing strange behaviour from alpha_equal:

from tvm import relay

def test_trivial():
    def fan_out():
        x = relay.var("x", shape=(10, 10))
        a_1 = relay.abs(x)
        a_2 = relay.abs(x)
        out = relay.add(a_1, a_2)
        f = relay.Function([x], out)
        return f

    def normal():
        x = relay.var("x", shape=(10, 10))
        a = relay.abs(x)
        out = relay.add(a, a)
        f = relay.Function([x], out)
        return f

    print(fan_out())
    print(normal())
    print(relay.analysis.alpha_equal(fan_out(), normal()))

test_trivial()

Which gives as output:

fn (%x: Tensor[(10, 10), float32]) {
  %0 = abs(%x);
  %1 = abs(%x);
  add(%0, %1)
}
fn (%x: Tensor[(10, 10), float32]) {
  %0 = abs(%x);
  add(%0, %0)
}
True

I’d expect the equality check to fail here, but it succeeds. I see this whenever there’s a pattern of multiple identical branches fanning out and then recombining. It can be a problem when testing graph annotation, as non-identical graphs pass the tests.

Is this expected behaviour or a bug? @zhiics @tqchen

@mbaret Yes, this is the same problem.