/tmp/tmps94u7qu7/my_kernel.cu(6908): Error: Formal parameter space overflowed (4544 bytes required, max 4096 bytes allowed) in function fused_reshape_gather_nd_reshape_floor_mod_less_zeros_like_where_reshape_gather_n_17367594856123799618__kernel0
This fused function has 576 params, and it seems each param requires 8 bytes, so the total size exceeds the 4096-byte limit.
NVIDIA suggests that we should pass a struct to avoid passing too many params. I’m not familiar with the codegen process, could this be fixed easily?
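
For reference, here is a minimal sketch of what the struct approach could look like in hand-written CUDA. It is not what the code generator emits: `KernelArgs`, `fused_kernel`, and `run_fused` are hypothetical names, and the kernel body is a placeholder. One caveat: a struct passed by value still counts against the parameter space, so the sketch assumes the pointer table is copied to device memory and the kernel receives a single pointer to it.

```cuda
#include <cuda_runtime.h>

// Hypothetical buffer count, matching the 576 params reported above.
constexpr int kNumBuffers = 576;

// Hypothetical table holding every buffer pointer the kernel needs.
// On a 64-bit platform this is 576 * 8 = 4608 bytes, too large to pass by value.
struct KernelArgs {
    float* buffers[kNumBuffers];
};

// The kernel takes one pointer to the table instead of 576 separate params.
__global__ void fused_kernel(const KernelArgs* args, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // Placeholder body: write the buffer index into element 0 of buffer i.
        args->buffers[i][0] = static_cast<float>(i);
    }
}

// Hypothetical host-side driver: builds the table, launches the kernel,
// and returns the value written into the last buffer.
// Error checking is omitted to keep the sketch short.
float run_fused() {
    KernelArgs host_args;
    for (int i = 0; i < kNumBuffers; ++i) {
        cudaMalloc(&host_args.buffers[i], sizeof(float));
    }

    // Copy the pointer table itself into device memory.
    KernelArgs* dev_args = nullptr;
    cudaMalloc(&dev_args, sizeof(KernelArgs));
    cudaMemcpy(dev_args, &host_args, sizeof(KernelArgs), cudaMemcpyHostToDevice);

    const int threads = 256;
    const int blocks = (kNumBuffers + threads - 1) / threads;
    fused_kernel<<<blocks, threads>>>(dev_args, kNumBuffers);
    cudaDeviceSynchronize();

    float last = -1.0f;
    cudaMemcpy(&last, host_args.buffers[kNumBuffers - 1], sizeof(float),
               cudaMemcpyDeviceToHost);

    for (int i = 0; i < kNumBuffers; ++i) {
        cudaFree(host_args.buffers[i]);
    }
    cudaFree(dev_args);
    return last;
}
```

Calling `run_fused()` from a `main` function and compiling with `nvcc` should exercise the whole path. The trade-off is an extra host-to-device copy and one more level of indirection inside the kernel, in exchange for a parameter list that stays far below the limit.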