[RELAY] Injective ops do not work with tuples

The function below causes the following error:

python: /usr/lib/llvm-6.0/include/llvm/IR/Instructions.h:1257: void llvm::FCmpInst::AssertOK(): Assertion `getOperand(0)->getType()->isFPOrFPVectorTy() && "Invalid operand types for FCmp instruction"' failed.

The failure happens while tvm/src/codegen/llvm/codegen_llvm.cc:841 translates the assert placeholder == tvm_struct_get(arg2, 0, 1). This assert is added because the input and output use the same buffer.

import tvm
import tvm.relay as relay

data_0 = relay.var("data_0", shape=(100, 5))
data_1 = relay.var("data_1", shape=(100, 7))
data_1_new = relay.transpose(data_1)
# data_0 is returned untouched as a field of the output tuple
output = relay.Tuple([data_0, data_1_new])
f = relay.Function([data_0, data_1], output)
relay.build(f, 'llvm')

The Relay IR of the function is:
fn (%data_0: Tensor[(100, 5), float32],
    %data_1: Tensor[(100, 7), float32]) {
  %0 = transpose(%data_1, axes=None)
  %1 = (%data_0, %0)
  %1
}
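A sketch of a possible workaround (not verified; it assumes relay.copy, Relay’s identity op, gives the output field its own buffer): insert an explicit copy so the tuple field is no longer the bare input parameter.

import tvm
import tvm.relay as relay

data_0 = relay.var("data_0", shape=(100, 5))
data_1 = relay.var("data_1", shape=(100, 7))
data_1_new = relay.transpose(data_1)
# copy data_0 so the output field gets its own buffer instead of aliasing the input
output = relay.Tuple([relay.copy(data_0), data_1_new])
f = relay.Function([data_0, data_1], output)
relay.build(f, 'llvm')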

Here is the IR after FuseOps:

fn (%data_0: Tensor[(100, 5), float32],
    %data_1: Tensor[(100, 7), float32])
    -> Tuple[Tensor[(100, 5), float32], Tensor[(7, 100), float32]] {
  %0 = fn(%p0: Tensor[(100, 5), float32],
          %p1: Tensor[(100, 7), float32])
          -> Tuple[Tensor[(100, 5), float32], Tensor[(7, 100), float32]] {
    %1 = transpose(%p1, axes=None)
    %2 = (%p0, %1)
    %2
  }
  %3 = %0(%data_0, %data_1)
  %3
}
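For reference, the fused IR above can be inspected directly with something like the following (a sketch against the relay.ir_pass API of this TVM revision; the opt_level is illustrative):

# type inference is needed before fusion can run
f = relay.ir_pass.infer_type(f)
print(relay.ir_pass.fuse_ops(f, opt_level=2))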

@masahi @tqchen

Hmm, the IR looks correct to me. I don’t know why LLVM is complaining.

It is https://github.com/dmlc/tvm/blob/0806b69e3fb136226fa1dafad00bd2c606cc998d/src/pass/arg_binder.cc#L49
Since data_0 is used as both an input and an output, they share the same buffer, so ArgBinder adds an assert that the two handles are equal. Presumably that equality comparison on handles (pointers) falls through to an FCmp in the LLVM codegen, which is what trips the assertion.

Somehow I am getting a segfault when running your script on Windows + LLVM 8.0svn. I still don’t understand what the issue is. Is buffer sharing the problem?

I’m not familiar with what ArgBinder does; I will look into it.

Some update: after I commented out the assert in https://github.com/dmlc/tvm/blob/0806b69e3fb136226fa1dafad00bd2c606cc998d/src/pass/arg_binder.cc#L49, the function compiles successfully. However, I ran into another issue in memory planning.

In this function,

  %0 = fn(%p0: Tensor[(100, 5), float32],
          %p1: Tensor[(100, 7), float32])
          -> Tuple[Tensor[(100, 5), float32], Tensor[(7, 100), float32]] {
    %1 = transpose(%p1, axes=None)
    %2 = (%p0, %1)
    %2
  }
  %3 = %0(%data_0, %data_1)

the output is a tuple. So in graph_plan_memory.cc we allocate a new storage for each tuple field. If an output tuple field is one of the input arguments, the result in this field is wrong because a new, unrelated storage is allocated for it.

By the way, why not just create the tuple outside of the subfunction?

Not sure if this answers your question, but the original reason we wanted to put tuple inside a subfunction was to let concat op fuse with other injective ops.

For example, upsampling + concat becomes upsampling -> tuple -> concat in Relay. The tuple needs to be inside the subfunction to let concat fuse with the upsampling op.
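To illustrate (a sketch; the upsampling signature here matches the API of this era, and the shapes are made up):

x = relay.var("x", shape=(1, 16, 32, 32))
up = relay.nn.upsampling(x, scale=2)                    # injective op
out = relay.concatenate(relay.Tuple([up, up]), axis=1)  # consumes a tuple

Fusion groups the injective upsampling together with the tuple and the concat, which is why the tuple ends up inside the fused subfunction.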

I see, then we need to fix the memory planning issue.

In graph_plan_memory.cc, if the function call returns a tuple, we need to check whether a tuple field is exactly one of the function parameters; in that case we should set token_map using the existing storage token of the corresponding argument instead of creating a new one.
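Roughly, something like the following (illustrative Python pseudocode only; the real implementation is the C++ visitor in graph_plan_memory.cc, and the helper names below are made up):

# Hypothetical sketch of the proposed check.
def plan_call_output(call, token_map):
    params = call.op.params          # parameters of the fused function
    ret = call.op.body               # here: the returned tuple
    for field in ret.fields:
        if field in params:
            # The field is exactly a function parameter: reuse the
            # storage token of the corresponding call argument.
            arg = call.args[params.index(field)]
            token_map[field] = token_map[arg]
        else:
            # Otherwise allocate fresh storage as before.
            token_map[field] = new_storage_token(field)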

@tqchen do you have ideas?
