I’ve got my test app working with OpenCL kernels in a source file. Now I want to use an offline tool to compile those kernels and then load the binary at runtime, to avoid the compilation cost at runtime. How can I do this?
To benefit from ahead-of-time compilation with OpenCL, you would need to know your workload exactly. In this case, is there a difference between what you need and what you can get by just running a warmup run of your known workload to pre-JIT everything?
Thanks for your response. Yes, I think there is a difference: I don’t want the overhead cost of the warmup. Or rather, I want to pay that cost once, and then use the compiled binaries from that point on.
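For what it's worth, the "compile once, then reuse" flow can be roughly sketched with the standard OpenCL calls: after the first clBuildProgram from source, query clGetProgramInfo with CL_PROGRAM_BINARY_SIZES and CL_PROGRAM_BINARIES and write the result to disk; on later runs, read the file back and pass it to clCreateProgramWithBinary, followed by clBuildProgram. The snippet below only covers the disk-caching half, in plain C so it builds without an OpenCL SDK. The helper names save_blob and load_blob and the cache file name are hypothetical, not part of any OpenCL API.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: write `size` bytes of a program binary to `path`.
 * The bytes would come from clGetProgramInfo(..., CL_PROGRAM_BINARIES, ...).
 * Returns 0 on success, -1 on failure. */
int save_blob(const char *path, const unsigned char *data, size_t size)
{
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t written = fwrite(data, 1, size, f);
    int close_failed = (fclose(f) != 0);
    return (written == size && !close_failed) ? 0 : -1;
}

/* Hypothetical helper: read a whole cached binary back into memory.
 * The buffer and its size would then be passed to clCreateProgramWithBinary.
 * Returns a malloc'd buffer (caller frees) or NULL if the file is missing,
 * empty, or unreadable. */
unsigned char *load_blob(const char *path, size_t *size_out)
{
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long len = ftell(f);
    if (len <= 0) { fclose(f); return NULL; }
    rewind(f);
    unsigned char *buf = (unsigned char *)malloc((size_t)len);
    if (!buf) { fclose(f); return NULL; }
    size_t got = fread(buf, 1, (size_t)len, f);
    fclose(f);
    if (got != (size_t)len) { free(buf); return NULL; }
    *size_out = got;
    return buf;
}
```

At startup you would call load_blob("kernels.bin", &size) first; if it returns NULL, or if clCreateProgramWithBinary or the following clBuildProgram reports an error, fall back to building from source and call save_blob again. That fallback matters because program binaries are specific to the device and driver they were built with, so a cached file can go stale after a driver update or on different hardware.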