[BUG] TVM Generate wrong OpenCL code

Hello!

I am currently constructing a network using TOPI. The problem is that TVM generates incorrect OpenCL Kernel Code for certain conditions.

For example, running the code below using the current version of TVM will cause problems.

import numpy as np
import topi
import tvm 

tgt = tvm.target.create("opencl")
tgt_host = 'llvm'
ctx = tvm.runtme.cl()

### Shape
input_shape = (1, 64, 224, 224)
kernel_shape = (64,64,3,3)

### make placeholder
ph_input = tvm.te.placeholder( input_shape )
ph_p1 = tvm.te.placeholder( kernel_shape )

### data
sample = np.random.uniform(-1, 1, size=input_shape).astype("float32")
p1 = np.random.uniform(-1, 1, size=kernel_shape).astype("float32")

### Allocate Memory
tvm_input = tvm.nd.array( sample , ctx )
tvm_p1 = tvm.nd.array( p1, ctx )


with tgt:
    conv = topi.topi.mali.conv2d_nchw_spatial_pack( ph_input, ph_p1, [1,1], [1,1,1,1], [1,1],"float32" )
    sch = topi.mali.schedule_conv2d_nchw_spatial_pack( [conv] )
    mod = tvm.build(sch, [ph_input, ph_p1 ], tgt, tgt_host )

mod( tvm_input, tvm_p1 )

When running the above code, the following error is occured.

tvm._ffi.base.TVMError: Traceback (most recent call last):
  [bt] (1) /home/alpha930/Desktop/TVM/tvm/build/libtvm.so(TVMFuncCall+0x65) [0x7fed55c962f5]
  [bt] (0) /home/alpha930/Desktop/TVM/tvm/build/libtvm.so(+0x145b84f) [0x7fed55caa84f]
            ^
  vstore4(((float44)(((__global float*)placeholder)[_1.s0],((__global float*)placeholder)[_1.s1],((__global float*)placeholder)[_1.s2],((__global float*)placeholder)[_1.s3])), 0, (__global float*)kernel_vec + ((((int)get_group_id(0)) * 1024) + (((int)get_local_id(0)) * 4)));
  File "/home/alpha930/Desktop/TVM/tvm/src/runtime/opencl/opencl_module.cc", line 220
  File "/home/alpha930/Desktop/TVM/tvm/src/runtime/library_module.cc", line 78
TVMError: Check failed: ret == 0 (-1 vs. 0) : OpenCL build error for device=0x1563660<kernel>:3:13: error: use of undeclared identifier 'float44'; did you mean 'float4'?
cl_kernel.h:140:1: note: 'float4' declared here
__NV_DECLARE_VECT_TYPES(float,           float)
^
cl_kernel.h:115:55: note: expanded from macro '__NV_DECLARE_VECT_TYPES'
typedef __attribute__(( ext_vector_type(4)  ))  CTYPE CLTYPE##4;  \
                                                      ^
<scratch space>:49:1: note: expanded from here
float4
^

terminate called after throwing an instance of 'dmlc::Error'
  what():  [14:11:54] /home/alpha930/Desktop/TVM/tvm/src/runtime/workspace_pool.cc:118: Check failed: allocated_.size() == 1 (4 vs. 1) : 
Stack trace:
  [bt] (0) /home/alpha930/Desktop/TVM/tvm/build/libtvm.so(tvm::runtime::WorkspacePool::Pool::Release(DLContext, tvm::runtime::DeviceAPI*)+0x815) [0x7fed55cda015]
  [bt] (1) /home/alpha930/Desktop/TVM/tvm/build/libtvm.so(tvm::runtime::WorkspacePool::~WorkspacePool()+0x37) [0x7fed55cd8327]
  [bt] (2) /home/alpha930/Desktop/TVM/tvm/build/libtvm.so(tvm::runtime::cl::OpenCLThreadEntry::~OpenCLThreadEntry()+0xd) [0x7fed55d529dd]
  [bt] (3) /lib/x86_64-linux-gnu/libc.so.6(__call_tls_dtors+0x3f) [0x7fed7bd3a8af]
  [bt] (4) /lib/x86_64-linux-gnu/libc.so.6(+0x43117) [0x7fed7bd3a117]
  [bt] (5) /lib/x86_64-linux-gnu/libc.so.6(+0x4313a) [0x7fed7bd3a13a]
  [bt] (6) /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xee) [0x7fed7bd18b9e]
  [bt] (7) python3(_start+0x2a) [0x5b250a]

In my opinion, as we can see in the error log above, TVM generates the wrong opencl kernel code called float44 and below code segment is kernel code that TVM generate.

__kernel void default_function_kernel1(__global float4* restrict kernel_vec, __global void* restrict placeholder) {
  int4 _1 = (int4)(((((((((int)get_group_id(0)) * 256) + ((int)get_local_id(0))) / 576) * 2304) + (((((int)get_group_id(0)) * 256) + ((int)get_local_id(0))) %
  vstore4(((float44)(((__global float*)placeholder)[_1.s0],((__global float*)placeholder)[_1.s1],((__global float*)placeholder)[_1.s2],((__global float*)place
}

The problem does not occur in previous versions of TVM, but the above problem occurs in current versions of TVM. Is this a bug inside TVM?

Is it possible to identify which commit causes this issue?

1 Like

No, I am not sure which commit causese this issue.

Face the same problem, still not sure the reason, seems a bug…

2 Likes