I’ve been working on separating dynamic ops from static ops: https://github.com/apache/incubator-tvm/pull/5826. The dynamic ops run with the Virtual Machine on either GPU or CPU without any issues, as long as I use only one backend.
I’m running into a complication with this code, though:
```python
for target, ctx in ctx_list():
    for kind in ["vm", "debug"]:
        mod = tvm.ir.IRModule.from_expr(func)
        intrp = relay.create_executor(kind, mod=mod, ctx=ctx, target=target)
        op_res = intrp.evaluate()(x_data, np.array(newshape))
        tvm.testing.assert_allclose(op_res.asnumpy(), ref_res, rtol=1e-5)
```
If I have a GPU in my system, that loop runs the test on CPU first and then attempts to run on GPU. When it hits the GPU run, I get this error; it seems the VM still expects to be passed CPU data:
```
TVMError: Check failed: ret == 0 (-1 vs. 0) : Assert fail: (1 == tvm_struct_get(arg0, 0, 10)), Argument arg0.device_type has an unsatisfied constraint
```
If I run just on GPU, I don’t hit the error. I also don’t see the error with the Graph Runtime or the debug backend.
This looks like an issue with some global state inside the VirtualMachine. Are you aware of anything that could cause this behavior?
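To illustrate the kind of failure mode I suspect (this is a hypothetical sketch in plain Python, not actual TVM internals): if some compilation artifact is cached globally and keyed only by the function, not by the target device, then the second run against a new device would silently reuse the artifact built for the first device, producing exactly a `device_type`-mismatch style error:

```python
# Hypothetical sketch: a compile cache that forgets to key on the device.
# All names here (compile_for, _compile_cache) are made up for illustration.
_compile_cache = {}

def compile_for(func_name, device):
    # Bug: the cache key ignores `device`, so the first compilation wins.
    if func_name not in _compile_cache:
        _compile_cache[func_name] = {"func": func_name, "device": device}
    return _compile_cache[func_name]

cpu_artifact = compile_for("reshape", "cpu")   # first run: compiled for CPU
gpu_artifact = compile_for("reshape", "gpu")   # second run: stale CPU artifact
assert gpu_artifact["device"] == "cpu"         # GPU run got CPU-targeted code
```

This would also be consistent with the observation above: running GPU-only never populates the cache with a CPU artifact, so no mismatch occurs.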