I’m encountering a very annoying error when building a convolutional model. With small input sizes, I can build with opt_level=3 and everything works great. With larger inputs, however, any opt_level>=1 fails with this error:
CUDA: Check failed: e == cudaSuccess || e == cudaErrorCudartUnloading: an illegal memory access was encountered
Although this might look like my GPU running out of memory, everything works fine with opt_level=0. Since the only extra pass at opt_level=1 is operator fusion, I suspect operators are being fused in a way that causes CUDA to run out of memory. Any thoughts on how to work around this?
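
One way to confirm which opt_level first triggers the failure is a small sweep like the sketch below. It is only an illustration: `first_failing_level` and `build_and_run` are hypothetical names, not part of any library, and `fake_build_and_run` is a stand-in that mimics the behaviour described above. In practice it would be replaced with a function that builds the model at the given opt_level and runs one inference on the GPU.

```python
from typing import Callable, Iterable, Optional, Tuple


def first_failing_level(
    build_and_run: Callable[[int], None],
    levels: Iterable[int] = (0, 1, 2, 3),
) -> Optional[Tuple[int, str]]:
    """Return (level, error message) for the lowest failing opt_level, or None.

    build_and_run is a hypothetical callable supplied by the caller. It should
    build the model at the given opt_level, run it once, and raise on failure.
    """
    # Go in ascending order and stop at the first failure: after a CUDA fault
    # the process may not be able to run further GPU work reliably.
    for level in sorted(levels):
        try:
            build_and_run(level)
        except Exception as err:
            return level, str(err)
    return None


def fake_build_and_run(opt_level: int) -> None:
    # Stand-in for the real build and run code (an assumption for this sketch):
    # it fails for every opt_level >= 1, as in the report above.
    if opt_level >= 1:
        raise RuntimeError("an illegal memory access was encountered")


print(first_failing_level(fake_build_and_run))
```

With the real build code plugged in, the returned level shows where the behaviour changes, which helps when deciding which passes to look at.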