TVM IR has immediate int/uint/float values, so why can't we have immediate tensors? On some special platforms, it is preferable to embed small constant tensors, such as a bias, directly into instructions to get better performance. As far as I can tell, there is no way to do this without a constant tensor in TVM IR. Any suggestions? Thanks.
I am curious how much this would help performance. If the tensor fits in an instruction, it also fits in the L1 cache without much impact on other data.
If you REALLY want it, you can use hybrid script with macro expansion: 1. pass a constant tensor as an argument; 2. use `const_range` to unroll the loop body at compile time.
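To make the idea concrete, here is a minimal plain-Python sketch of what compile-time unrolling over a constant tensor achieves: each element of the constant ends up as a literal in the generated code, so no load from a bias buffer remains. This is only an illustration and does not use TVM; `emit_unrolled_bias_add`, `build`, and `BIAS` are hypothetical names made up for this example, and a real hybrid script would rely on `const_range` for the unrolling step instead.

```python
# Illustration only: plain Python, not TVM. All names here are hypothetical.

def emit_unrolled_bias_add(bias):
    """Generate source for bias_add(x) with the loop fully unrolled.

    Each bias element is written into the source as a literal, which is
    the analogue of an immediate operand in the final instruction stream.
    """
    lines = [
        "def bias_add(x):",
        "    out = [0.0] * %d" % len(bias),
    ]
    # This loop runs at "compile time"; the generated body has no loop.
    for i, b in enumerate(bias):
        lines.append("    out[%d] = x[%d] + %r" % (i, i, b))
    lines.append("    return out")
    return "\n".join(lines)


def build(bias):
    """Compile the generated source and return the resulting function."""
    namespace = {}
    exec(emit_unrolled_bias_add(bias), namespace)
    return namespace["bias_add"]


BIAS = [0.5, -1.0, 2.0]          # small constant tensor known at compile time
source = emit_unrolled_bias_add(BIAS)
bias_add = build(BIAS)
```

Printing `source` shows one statement per element with the bias value inlined; whether a real backend then encodes those literals as instruction immediates depends on the target.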
- This is not a generic platform, and embedding the constants really does help.
- I noticed that scheduling seems to have some restrictions for hybrid script, right? I need to `compute_at` some other ops into it after a `split`, but that fails in my quick test. Does it support `compute_at` together with `split`?
Again, why can't we have constant tensors in TVM IR?