Hi all,
I’d like to manually tinker with and execute the CUDA code that TVM generates. The generated kernels often take integer arguments such as strides, as in the signature below. Is there an easy way to get the values of these arguments without reading through the generated LLVM IR for the host code?
extern "C" __global__ void default_function_kernel0(
    float* __restrict__ Wh2h, float* __restrict__ lstm_scan_v0,
    float* __restrict__ lstm_scan_v1, float* __restrict__ Xi2h,
    int stride, int stride1, int stride2, int stride3, int num_step,
    int stride4, int stride5, int stride6, int stride7, int stride8,
    int stride9) {...}
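To frame the question: my working assumption is that, for compact row-major buffers, each stride argument is simply the element stride implied by the tensor's shape. I have not confirmed that this is what the host code actually passes, and I don't know which stride belongs to which buffer dimension. A minimal sketch of that assumption, where `compact_strides` is my own hypothetical helper (not a TVM API):

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of TVM: element strides of a compact,
// row-major tensor with the given shape. The innermost dimension has
// stride 1; each outer stride is the product of all inner extents.
std::vector<int64_t> compact_strides(const std::vector<int64_t>& shape) {
    std::vector<int64_t> strides(shape.size());
    int64_t running = 1;
    // Walk the dimensions from innermost to outermost.
    for (std::size_t i = shape.size(); i-- > 0;) {
        strides[i] = running;
        running *= shape[i];
    }
    return strides;
}
```

For a made-up shape of {2, 3, 4} this gives {12, 4, 1}. If the real values differ from this, that is exactly what I would like to be able to find out.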