[RELAY] Injective ops do not work with tuples

The function below causes the following error:

python: /usr/lib/llvm-6.0/include/llvm/IR/Instructions.h:1257: void llvm::FCmpInst::AssertOK(): Assertion `getOperand(0)->getType()->isFPOrFPVectorTy() && "Invalid operand types for FCmp instruction"' failed.

The failure happens while tvm/src/codegen/llvm/codegen_llvm.cc:841 translates the assert placeholder == tvm_struct_get(arg2, 0, 1). This assert is added because the input and output use the same buffer.

import tvm
import tvm.relay as relay

data_0 = relay.var("data_0", shape=(100, 5))
data_1 = relay.var("data_1", shape=(100, 7))
data_1_new = relay.transpose(data_1)
# data_0 is returned untouched as a field of the output tuple
output = relay.Tuple([data_0, data_1_new])
f = relay.Function([data_0, data_1], output)
relay.build(f, 'llvm')

The Relay IR of the function is:
fn (%data_0: Tensor[(100, 5), float32],
    %data_1: Tensor[(100, 7), float32]) {
  %0 = transpose(%data_1, axes=None)
  %1 = (%data_0, %0)
  %1
}
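A sketch of a possible workaround (not verified; it assumes relay.copy, Relay’s identity op, gives the output field its own buffer): insert an explicit copy so the tuple field is no longer the bare input parameter.

import tvm
import tvm.relay as relay

data_0 = relay.var("data_0", shape=(100, 5))
data_1 = relay.var("data_1", shape=(100, 7))
data_1_new = relay.transpose(data_1)
# copy data_0 so the output field gets its own buffer instead of aliasing the input
output = relay.Tuple([relay.copy(data_0), data_1_new])
f = relay.Function([data_0, data_1], output)
relay.build(f, 'llvm')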

Here is the IR after FuseOps:

fn (%data_0: Tensor[(100, 5), float32],
    %data_1: Tensor[(100, 7), float32])
    -> Tuple[Tensor[(100, 5), float32], Tensor[(7, 100), float32]] {
  %0 = fn(%p0: Tensor[(100, 5), float32],
          %p1: Tensor[(100, 7), float32])
          -> Tuple[Tensor[(100, 5), float32], Tensor[(7, 100), float32]] {
    %1 = transpose(%p1, axes=None)
    %2 = (%p0, %1)
    %2
  }
  %3 = %0(%data_0, %data_1)
  %3
}
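For reference, the fused IR above can be inspected directly with something like the following (a sketch against the relay.ir_pass API of this TVM revision; the opt_level is illustrative):

# type inference is needed before fusion can run
f = relay.ir_pass.infer_type(f)
print(relay.ir_pass.fuse_ops(f, opt_level=2))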

@masahi @tqchen

Hmm, the IR looks correct to me. I don’t know why LLVM is complaining.

It is https://github.com/dmlc/tvm/blob/0806b69e3fb136226fa1dafad00bd2c606cc998d/src/pass/arg_binder.cc#L49
Since data_0 is used as both an input and an output, they share the same buffer, so ArgBinder adds an assert that the two handles are equal. Presumably that equality comparison on handles (pointers) falls through to an FCmp in the LLVM codegen, which is what trips the assertion.

Somehow I am getting a segfault when running your script on Windows + LLVM 8.0svn. I still don’t understand what the issue is. Is buffer sharing the problem?

I’m not familiar with what ArgBinder does; I will look into it.

Some update: after I commented out the assert in https://github.com/dmlc/tvm/blob/0806b69e3fb136226fa1dafad00bd2c606cc998d/src/pass/arg_binder.cc#L49, the function compiles successfully. However, I ran into another issue in memory planning.

In this function,

  %0 = fn(%p0: Tensor[(100, 5), float32],
          %p1: Tensor[(100, 7), float32])
          -> Tuple[Tensor[(100, 5), float32], Tensor[(7, 100), float32]] {
    %1 = transpose(%p1, axes=None)
    %2 = (%p0, %1)
    %2
  }
  %3 = %0(%data_0, %data_1)

the output is a tuple. So in graph_plan_memory.cc we allocate a new storage for each tuple field. If an output tuple field is one of the input arguments, the result in this field is wrong because a new, unrelated storage is allocated for it.

By the way, why not just create the tuple outside of the subfunction?

Not sure if this answers your question, but the original reason we wanted to put tuple inside a subfunction was to let concat op fuse with other injective ops.

For example, upsampling + concat becomes upsampling -> tuple -> concat in Relay. The tuple needs to be inside the subfunction to let concat fuse with the upsampling op.
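To illustrate (a sketch; the upsampling signature here matches the API of this era, and the shapes are made up):

x = relay.var("x", shape=(1, 16, 32, 32))
up = relay.nn.upsampling(x, scale=2)                    # injective op
out = relay.concatenate(relay.Tuple([up, up]), axis=1)  # consumes a tuple

Fusion groups the injective upsampling together with the tuple and the concat, which is why the tuple ends up inside the fused subfunction.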

I see, then we need to fix the memory planning issue.

In graph_plan_memory.cc, if the function call returns a tuple, we need to check whether a tuple field is exactly one of the function parameters; in that case we should set token_map using the existing storage token of the corresponding argument instead of creating a new one.
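Roughly, something like the following (illustrative Python pseudocode only; the real implementation is the C++ visitor in graph_plan_memory.cc, and the helper names below are made up):

# Hypothetical sketch of the proposed check.
def plan_call_output(call, token_map):
    params = call.op.params          # parameters of the fused function
    ret = call.op.body               # here: the returned tuple
    for field in ret.fields:
        if field in params:
            # The field is exactly a function parameter: reuse the
            # storage token of the corresponding call argument.
            arg = call.args[params.index(field)]
            token_map[field] = token_map[arg]
        else:
            # Otherwise allocate fresh storage as before.
            token_map[field] = new_storage_token(field)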

@tqchen do you have ideas?
