Problem
The code my backend generates depends on the values inside the weight and bias tensors. This appears to be a problem at both compile time and runtime, because GetFunction queries by "func_name" and maps distinct operators that happen to share a name to the same code implementation.
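To illustrate the lookup I mean, here is a minimal sketch (not TVM's actual executor code); resolve_node_funcs is a hypothetical helper, and lib.get_function is just the Python counterpart of the C++ GetFunction:

```python
import tvm

def resolve_node_funcs(lib: tvm.runtime.Module, node_func_names):
    # Each graph node's "func_name" is resolved purely by name, so two nodes
    # with the same name get the same PackedFunc back, regardless of which
    # layer (and therefore which weight/bias values) the node belongs to.
    return {name: lib.get_function(name) for name in set(node_func_names)}
```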
Example
My simple sequential network consists of 512-unit dense layers with bias and an activation. Several of these 512-unit dense layers all query the same code for "fused_nn_dense_add_sigmoid_kernel0". Because each op's weight and bias values need to change the actual code that is generated AOT, sharing one implementation means inference returns an incorrect result every time.
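A minimal reproduction sketch of that network, assuming the Relay Python API and an LLVM target just to show the build flow; the helper name dense_block, the two-layer depth, and the random parameter values are illustrative only:

```python
import numpy as np
import tvm
from tvm import relay

def dense_block(x, units, idx):
    # One 512-unit dense layer with bias and sigmoid activation.
    w = relay.var("w%d" % idx, shape=(units, units), dtype="float32")
    b = relay.var("b%d" % idx, shape=(units,), dtype="float32")
    y = relay.nn.dense(x, w)
    y = relay.nn.bias_add(y, b)
    return relay.sigmoid(y)

data = relay.var("data", shape=(1, 512), dtype="float32")
out = dense_block(dense_block(data, 512, 0), 512, 1)
func = relay.Function(relay.analysis.free_vars(out), out)
mod = tvm.IRModule.from_expr(func)

# Distinct constant values for each layer's weight and bias.
params = {
    name: tvm.nd.array(np.random.uniform(-1, 1, size=shape).astype("float32"))
    for name, shape in [("w0", (512, 512)), ("b0", (512,)),
                        ("w1", (512, 512)), ("b1", (512,))]
}

# After building, both fused dense+bias+sigmoid nodes end up pointing at the
# same function name (as described above), so a single generated function is
# reused for both layers.
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="llvm", params=params)
```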
Question
Is there any way to disable TVM's assumption that structurally identical op nodes can share the same generated code?
In my opinion, the only correct time to apply this optimization is after code has already been generated for each op, and one op's generated code can be mapped 1:1 onto another's.