How to deploy a TVM model for CUDA and OpenCL to create C++ APIs


#1

Hi @srkreddy1238 @tqchen @merrymercy @ZihengJiang, I am deploying a TVM-compiled model for CUDA and OpenCL. Which values do I need to set in C++ for the fields below?

int dtype_code = kDLFloat;  // what do I need to set for CUDA?
int dtype_bits = 32;  // what do I need to set for CUDA?
int dtype_lanes = 1;  // what do I need to set for CUDA?
int device_type = kDLGPU;  // for CUDA
int device_id = 0;  // what do I need to set for CUDA?
int in_ndim = 4;  // input shape (1, 64, 64, 3)

int dtype_code = kDLFloat;  // what do I need to set for OpenCL?
int dtype_bits = 32;  // what do I need to set for OpenCL?
int dtype_lanes = 1;  // what do I need to set for OpenCL?
int device_type = kDLOpenCL;  // for OpenCL
int device_id = 0;  // what do I need to set for OpenCL?
int in_ndim = 4;  // input shape (1, 64, 64, 3)

I used the DLContext settings below for llvm, and they work fine:
int dtype_code = kDLFloat;
int dtype_bits = 32;
int dtype_lanes = 1;
int device_type = kDLCPU;
int device_id = 0;
int in_ndim = 4;
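
For readers comparing the three sets of values above, here is a minimal sketch of how they could be grouped. It assumes that the dtype fields describe only the tensor's element type (float32, one lane), so they would stay the same on every device, and that only device_type changes between CPU, CUDA, and OpenCL. The `sketch` namespace, `TensorSpec`, `MakeFloat32Spec`, and `NumBytes` are hypothetical names for illustration. The enum values are stand-ins that mirror dlpack.h as of this thread; a real build would include <dlpack/dlpack.h> instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace sketch {

// Stand-ins mirroring the DLPack constants used in this thread.
// In a real build, include <dlpack/dlpack.h> rather than redefining them.
enum DLDeviceType { kDLCPU = 1, kDLGPU = 2, kDLOpenCL = 4 };
enum DLDataTypeCode { kDLInt = 0, kDLUInt = 1, kDLFloat = 2 };

// Hypothetical holder for the values one would pass on to TVMArrayAlloc.
struct TensorSpec {
  int dtype_code;
  int dtype_bits;
  int dtype_lanes;
  int device_type;
  int device_id;
  std::vector<int64_t> shape;
};

// Assumption: the model input is float32 with shape (1, 64, 64, 3), as in the
// post. Only device_type differs per backend; device_id 0 is the first device.
inline TensorSpec MakeFloat32Spec(DLDeviceType device_type, int device_id = 0) {
  assert(device_id >= 0);
  return TensorSpec{static_cast<int>(kDLFloat), 32, 1,
                    static_cast<int>(device_type), device_id,
                    {1, 64, 64, 3}};
}

// Size in bytes of a tensor described by the spec.
inline std::size_t NumBytes(const TensorSpec& spec) {
  std::size_t elements = 1;
  for (int64_t dim : spec.shape) elements *= static_cast<std::size_t>(dim);
  return elements * static_cast<std::size_t>(spec.dtype_bits) *
         static_cast<std::size_t>(spec.dtype_lanes) / 8;
}

}  // namespace sketch
```

With this sketch, `sketch::MakeFloat32Spec(sketch::kDLGPU)` and `sketch::MakeFloat32Spec(sketch::kDLOpenCL)` differ only in the device_type field.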


#2

Hi, @srkreddy1238 @tqchen any comments on this?


#3

@srkreddy1238 any comments? Also, can I access GPU memory from the CPU directly?
For CUDA, I am not able to access the DLTensor data pointer directly from the CPU.


#4

Are these questions the same as this one? Maybe someone could answer it in a centralized place?


#5

So the point is that CUDA’s native memory address space is isolated from the CPU’s, which is sometimes called distributed memory. This is by design on NVIDIA’s part, because of the benefits listed on the Wikipedia page.

I saw several of your posts, but didn’t fully understand what you are referring to. If you are trying to memset CUDA memory, that is definitely not possible, because memset works only on CPU memory.
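
One common way around this is to prepare the data in host memory and then hand it to the runtime for an explicit host-to-device copy. Below is a minimal sketch of that pattern, not TVM's own code. `HostToDeviceCopy` and `ZeroFillViaHost` are hypothetical names, and the callback stands in for a runtime copy call; in TVM's C runtime, TVMArrayCopyFromBytes (and TVMArrayCopyToBytes for the reverse direction) would be the likely candidate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical callback standing in for a runtime host-to-device copy,
// for example a wrapper around TVMArrayCopyFromBytes(handle, data, nbytes).
// It returns 0 on success, following the C runtime's convention.
using HostToDeviceCopy =
    std::function<int(const void* data, std::size_t nbytes)>;

// Zero-fills a float32 device tensor without dereferencing the device pointer
// on the CPU: the zeros are built in a host staging buffer, and the copy
// callback transfers them.
inline int ZeroFillViaHost(std::size_t num_elements,
                           const HostToDeviceCopy& copy) {
  assert(copy);  // a copy routine must be supplied
  std::vector<float> staging(num_elements, 0.0f);
  return copy(staging.data(), staging.size() * sizeof(float));
}
```

The same staging approach applies to real input data: fill a host buffer, copy it to the device tensor, run the model, then copy the output back to a host buffer before reading it.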

If your point is that device_api.gpu is not found, please compile and link your library properly, and don’t lazy-load the lib.so file.


#6

Thank you @junrushao1994 for the response; the issue is resolved.