I want to use multiprocessing to try different schedules simultaneously, but I can’t get the device context in forked processes: tvm.context("cuda", 0).exist returns False in the child, while in the parent process it is True. In the parent process I have already allocated some NDArrays (so maybe I have initialized the CUDA runtime, I am not sure…). Is this the cause of the trouble?
CUDA itself doesn’t work very well with multiprocessing. Are you using spawn or fork?
I am using the default context of multiprocessing, so it should be ‘fork’, I guess.
It seems CUDA works better with spawn… (I don’t quite understand why.)
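For anyone landing here later, a minimal sketch of the spawn approach. A likely explanation for the difference is that a CUDA runtime already initialized in the parent cannot be reused in a forked child, whereas a spawned child starts a fresh interpreter and initializes CUDA itself. The helper names (`spawn_context`, `check_device`, `main`) are made up for illustration, and the sketch assumes a TVM version that still provides `tvm.context` as used above; if TVM is missing or the API differs, the worker just reports `None`.

```python
import multiprocessing as mp


def spawn_context():
    # Ask explicitly for the "spawn" start method instead of relying on
    # the platform default (which is "fork" on Linux in many Python versions).
    return mp.get_context("spawn")


def check_device(dev_id):
    # Hypothetical worker: import TVM inside the function so that the
    # CUDA runtime is first touched in the child process.
    try:
        import tvm
        return dev_id, bool(tvm.context("cuda", dev_id).exist)
    except (ImportError, AttributeError):
        # TVM is not installed, or this TVM version has no tvm.context.
        return dev_id, None


def main():
    # Each task runs in a freshly spawned interpreter.
    with spawn_context().Pool(processes=2) as pool:
        for dev_id, ok in pool.map(check_device, [0, 0]):
            print("device", dev_id, "exists:", ok)
```

With spawn, the child re-imports the main module, so call `main()` from a script file under an `if __name__ == "__main__":` guard, and keep the worker function at module top level so it can be pickled.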