[ERROR] Half type support in NVRTC

Now that PR https://github.com/dmlc/tvm/pull/4056 has been merged, we should have better float16 support for CUDA. It works well with the NVCC compiler, but there seem to be problems under NVRTC. The compile error is:

TVMError: Check failed: compile_res == NVRTC_SUCCESS (6 vs. 0) : 
default_program(12): error: class "__half_raw" has no suitable copy constructor

default_program(12): error: class "__half_raw" has no suitable copy constructor

default_program(16): error: class "__half_raw" has no suitable copy constructor

default_program(16): error: class "__half_raw" has no suitable copy constructor

default_program(20): error: class "__half_raw" has no suitable copy constructor

default_program(20): error: class "__half_raw" has no suitable copy constructor

6 errors detected in the compilation of "default_program".

@xyzhou Could you please take a look at it? Thank you!

Could you share your test case?
Thanks

It happens whenever float16 is used with the cuda target.
Here is a simple example that reproduces the error.

import tvm

n = 32
A = tvm.placeholder((n, ), name='A', dtype="float16")
B = tvm.compute((n, ), lambda i: A[i] + 1, name='B')
s = tvm.create_schedule(B.op)

axis, = B.op.axis
s[B].bind(axis, tvm.thread_axis("threadIdx.x"))

func = tvm.build(s, [A, B], 'cuda')

@Hzfengsy, any updates about this? I ran into the same issue. Or, do you know how to switch to the nvcc compiler?

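It is easy to switch to nvcc. Just register the following callback before calling tvm.build.

import tvm
from tvm.contrib import nvcc

@tvm.register_func
def tvm_callback_cuda_compile(code):
    """Use nvcc compiler for better perf."""
    # Compile the generated CUDA source to PTX with nvcc instead of NVRTC.
    ptx = nvcc.compile_cuda(code, target="ptx")
    return ptx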
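Note that nvcc.compile_cuda shells out to the nvcc executable, so the callback only helps if nvcc is on your PATH. Below is a rough sketch of a guarded registration, not something from the TVM codebase: the helper names nvcc_available and register_nvcc_callback are made up for illustration, and it assumes the same tvm.register_func API used above.

```python
import shutil


def nvcc_available():
    """Return True if an `nvcc` executable can be found on PATH."""
    return shutil.which("nvcc") is not None


def register_nvcc_callback():
    """Register the nvcc compile callback (hypothetical helper).

    Returns False without touching TVM when nvcc is not installed,
    so the default NVRTC path stays in effect.
    """
    if not nvcc_available():
        return False

    # Imported lazily so the PATH check above works without TVM installed.
    import tvm
    from tvm.contrib import nvcc

    # override=True replaces a previously registered callback of the same name.
    @tvm.register_func("tvm_callback_cuda_compile", override=True)
    def tvm_callback_cuda_compile(code):
        # Compile the generated CUDA source to PTX with nvcc.
        return nvcc.compile_cuda(code, target="ptx")

    return True
```

Call register_nvcc_callback() once before tvm.build. If it returns False, the build still goes through NVRTC and the float16 error above will show up again.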

On the other hand, this PR seems to fix some problems that occurred on Windows. I’m sorry, but I do not really understand the purpose of that PR, and I have no Windows machine to fix it on. Maybe we should wait for the PR author’s response.


The problem can be fixed by https://github.com/dmlc/tvm/pull/4239