Error "Direct host side access to device memory is detected in ... , did you forget to bind?" when compiling an ONNX model with target cuda


I don’t think there is a plan to add a GPU schedule for matmul. But there is the batched_matmul schedule for GPU. You can use that from our onnx frontend.


Thanks @masahi, could you guide me towards writing the GPU schedule for matmul, since there might already be one for CPU?


Hi @masahi @srkreddy1238 @FrozenGene @tqchen,
I am facing the same error mentioned above.
Which line did you change to resolve the "Did you forget to bind?" error?

I'm getting the error below:

nnvm._base.NNVMError: ValueError: Direct host side access to device memory is detected in fuse_matmul_relu. Did you forget to bind?

I did not understand how to use batched_matmul for this case.
Any help here?


Hi @masahi @Hkathuria @yqwang, I am facing the same issue on a Metal GPU.

Error: ValueError: Direct host side access to device memory is detected in addone. Did you forget to bind?

The Python script is:

import tvm
import os

def prepare_test_libs(base_path):
    n = tvm.var("n")
    A = tvm.placeholder((n,), name='A')
    B = tvm.compute(A.shape, lambda *i: A(*i) + 1.0, name='B')
    s = tvm.create_schedule(B.op)
    # Compile library as dynamic library
    fadd_dylib = tvm.build(s, [A, B], "metal", name="addone")
    dylib_path = os.path.join(base_path, "test_addone_dll.dylib")

if __name__ == "__main__":
    curr_path = os.path.dirname(os.path.abspath(os.path.expanduser(__file__)))
    prepare_test_libs(os.path.join(curr_path, "./lib"))

I am building it for a Metal device.


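You need to add a proper schedule, such as binding the axes to GPU threads. With the default schedule nothing is bound to a block or thread index, so the computation is not placed inside a device kernel, which is what the error is complaining about.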


Hi @vinx13, could you please suggest a sample for this?


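Something like this, where C is your output tensor (B in the script above):
# split the single loop axis into an outer (block) and inner (thread) part
bx, tx = s[C].split(C.op.axis[0], factor=64)
s[C].bind(bx, tvm.thread_axis("blockIdx.x"))
s[C].bind(tx, tvm.thread_axis("threadIdx.x"))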
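
To make it clearer what splitting an axis and binding it to blockIdx.x / threadIdx.x does, here is a minimal plain-Python sketch of the resulting index mapping for the "add one" computation. It does not use TVM at all: the function name simulate_split_bind and the factor value of 64 are assumptions chosen for illustration, and the two loops run sequentially here, whereas on a real GPU every (block, thread) pair would run in parallel.

```python
def simulate_split_bind(a, factor=64):
    """Emulate a 1-D 'add one' kernel after splitting the axis by `factor`.

    The outer loop plays the role of blockIdx.x and the inner loop plays
    the role of threadIdx.x. This is an illustration only, not TVM output.
    """
    n = len(a)
    num_blocks = (n + factor - 1) // factor  # ceil(n / factor)
    b = [None] * n
    for block_idx in range(num_blocks):      # stands in for blockIdx.x
        for thread_idx in range(factor):     # stands in for threadIdx.x
            i = block_idx * factor + thread_idx
            if i < n:                        # guard for a partial last block
                b[i] = a[i] + 1.0
    return b


# 100 elements with factor 64 need two blocks; the second is only partly used.
result = simulate_split_bind([float(x) for x in range(100)], factor=64)
```

Each element is handled by exactly one (block, thread) pair, which is why the bound version can be compiled into a device kernel instead of a host-side loop.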