How to limit VGPRs and SGPRs for ROCm?
It’s been a while, but when you set the maximum workgroup size, LLVM limits the number of registers each thread may use so that a workgroup of that size can be accommodated. This PR sets this for the AMDGPU codegen:
Last time I looked, it was not configurable, but the codegen queries the device properties.
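To give a feel for why the maximum workgroup size caps the register count, here is a rough back-of-the-envelope sketch. The constants are my assumptions for a GCN/gfx9-class GPU (64-wide wavefronts, 4 SIMDs per compute unit, 256 VGPRs per SIMD lane, allocation in steps of 4). They are not taken from the PR and will differ on other architectures, and the function name is made up for illustration. LLVM's actual calculation has more cases than this.

```python
import math

# Assumed GCN/gfx9-class constants (illustrative only; other targets differ).
WAVE_SIZE = 64            # threads per wavefront
SIMDS_PER_CU = 4          # SIMD units per compute unit
VGPRS_PER_SIMD = 256      # VGPRs available per lane on each SIMD
VGPR_GRANULE = 4          # VGPRs are allocated in multiples of this
MAX_VGPRS_PER_WAVE = 256  # upper bound a single wave can address


def max_vgprs_for_workgroup(max_workgroup_size: int) -> int:
    """Estimate the per-thread VGPR budget for a given maximum workgroup size.

    Hypothetical helper: the whole workgroup must be resident on one compute
    unit, so its waves are spread over the SIMDs and share each SIMD's
    register file.
    """
    if max_workgroup_size <= 0:
        raise ValueError("max_workgroup_size must be positive")
    # Number of wavefronts needed to hold the workgroup.
    waves = math.ceil(max_workgroup_size / WAVE_SIZE)
    # Waves that end up on the busiest SIMD.
    waves_per_simd = math.ceil(waves / SIMDS_PER_CU)
    # Split the SIMD's register file between those waves.
    budget = VGPRS_PER_SIMD // waves_per_simd
    # Round down to the allocation granularity.
    budget -= budget % VGPR_GRANULE
    return min(budget, MAX_VGPRS_PER_WAVE)


if __name__ == "__main__":
    for size in (256, 512, 1024):
        print(size, max_vgprs_for_workgroup(size))
```

Under these assumptions a maximum workgroup size of 1024 leaves 64 VGPRs per thread, while 256 leaves the full 256, so lowering the maximum workgroup size is the knob that relaxes the register limit.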
Best regards
Thomas