[Resolved][Relay][Fuse] Fuse generates an incorrect result

The following simple code snippet generates an incorrect result:

import numpy as np
import tvm
from tvm import relay
import tvm.relay.op as _op
from tvm.ir import IRModule
from tvm.runtime.vm import VirtualMachine

# NC8c (1, 2, 8) -> NC (1, 16) -> reshape (4, 4) -> transpose (4, 4) -> split into 4 sections
dshape = (1, 2, 8)
data = relay.var("data", shape=dshape, dtype="float32")
o0 = _op.layout_transform(data, "NC8c", "NC")
o0 = _op.reshape(o0, [-1, 4])
o0 = _op.transpose(o0, [1, 0])
o0 = _op.split(o0, indices_or_sections=4)
func = relay.Function([data], o0[0])  # return only the first split section

mod = IRModule()
mod["main"] = func
with relay.build_config(opt_level=3):
    vm_exec = relay.vm.compile(mod, target="llvm")

vm = VirtualMachine(vm_exec)
ctx = tvm.cpu()
vm.init(ctx)

in_data = np.array([[[0, 1, 2, 3, 4, 5, 6, 7], [0, 1, 2, 3, 4, 5, 6, 7]]]).astype("float32")
res = vm.invoke("main", in_data)
print(res.asnumpy())
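
For this shape the NC8c -> NC layout transform is just a contiguous flatten, so what the pipeline should produce can be checked with NumPy alone (reusing np and in_data from the snippet above):

ref = in_data.reshape(1, 16)        # NC8c -> NC: contiguous flatten here
ref = ref.reshape(-1, 4).T          # reshape to (4, 4), then transpose
print(np.split(ref, 4, axis=0)[0])  # first section -> [[0. 4. 0. 4.]]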

The expected result is [[0. 4. 0. 4.]], but it actually returns [[0. 1. 0. 1.]]. This snippet is reduced from a large model. I can make the model output correct by temporarily setting the fusion pattern of split or transpose to kOutEWiseFusable.
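
One way to apply that workaround from Python is to re-register the op's fusion pattern; this is a sketch of a temporary hack, not a fix (level=11 is just an arbitrary priority above the default registration, and the override affects every use of the op):

from tvm import relay

# Workaround sketch: mark transpose as kOutEWiseFusable so the fuser stops
# inlining it as a plain injective op. Registering at level=11 overrides the
# default registration at level=10.
relay.op.register_pattern("transpose", relay.op.OpPattern.OUT_ELEMWISE_FUSABLE, level=11)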

Any idea how to fix this? @tqchen @yzhliu @haichen @masahi

I turned off AutoInlineInjective and the result is correct; the IR then looks like:

fused_layout_transform_reshape_transpose_split
// attr [T_layout_trans] storage_scope = "global"
allocate T_layout_trans[float32 * 16]
// attr [T_reshape] storage_scope = "global"
allocate T_reshape[float32 * 4]
for (ax1, 0, 16) {
  T_layout_trans[ax1] = placeholder[ax1]
}
for (ax0, 0, 4) {
  T_reshape[ax0] = T_layout_trans[(ax0*4)]
}
for (ax1, 0, 4) {
  T_split_sections[ax1] = T_reshape[ax1]
}
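
For anyone who wants to poke at this, here is a rough sketch that builds the same pipeline with topi and lowers it by hand (assuming these topi ops match what fusion generates; on some versions topi is a standalone import topi rather than tvm.topi):

import tvm
from tvm import te, topi

# Build the fused pipeline manually and lower it with the injective schedule.
data = te.placeholder((1, 2, 8), name="placeholder", dtype="float32")
t = topi.layout_transform(data, "NC8c", "NC")
t = topi.transpose(topi.reshape(t, (4, 4)), (1, 0))
out = topi.split(t, 4, axis=0)[0]

s = te.create_schedule(out.op)
# Comment out the next line to get the un-inlined IR shown above.
tvm.te.schedule.AutoInlineInjective(s)
print(tvm.lower(s, [data, out], simple_mode=True))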

If I turn it back on, the IR becomes:

fused_layout_transform_reshape_transpose_split
for (ax1, 0, 4) {
  T_split_sections[ax1] = placeholder[((floordiv(ax1, 2)*8) + floormod(ax1, 2))]
}
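
Plugging ax1 = 0..3 into that fused index makes the bug concrete (again reusing np and in_data from the repro above):

flat = in_data.reshape(16)                             # NC-flattened input: 0..7, 0..7
bad = [(ax1 // 2) * 8 + ax1 % 2 for ax1 in range(4)]   # 0, 1, 8, 9
good = [ax1 * 4 for ax1 in range(4)]                   # 0, 4, 8, 12 (column 0 after transpose)
print(flat[bad], flat[good])                           # [0. 1. 0. 1.] vs [0. 4. 0. 4.]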

It looks like there is an issue in TVM's lowering of this pattern: the transpose's index remapping is lost when the injective ops are inlined.

PR: https://github.com/apache/incubator-tvm/pull/5505
