When quantizing from float32 -> int8, I need to pass a few items to CodeGen.
These are:
- Accumulator values during data-aware calibration – for any matmul, the min/max of the output in both float32 and int8 fixed point
- Min/max for each of the input/weight/bias tensors
- Log2 scale factors for each tensor
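To make the list above concrete, here is a minimal sketch of collecting those per-tensor stats – min/max plus a power-of-two (log2) scale factor for symmetric int8 quantization. The function name `calibration_stats` and the rounding-up of the scale are my own illustrative choices, not any framework's actual API:

```python
import numpy as np

def calibration_stats(tensor, num_bits=8):
    """Illustrative sketch: per-tensor min/max and a log2 (power-of-two)
    scale factor for symmetric int8 quantization. Not a real framework API."""
    t_min = float(np.min(tensor))
    t_max = float(np.max(tensor))
    # Symmetric range: pick the scale so the largest |value| fits in int8.
    abs_max = max(abs(t_min), abs(t_max), 1e-12)
    qmax = 2 ** (num_bits - 1) - 1  # 127 for int8
    # Round the scale up to a power of two so shifts can replace multiplies.
    log2_scale = int(np.ceil(np.log2(abs_max / qmax)))
    return {"min": t_min, "max": t_max, "log2_scale": log2_scale}

stats = calibration_stats(np.array([-3.0, 0.5, 2.75]))
print(stats)  # {'min': -3.0, 'max': 2.75, 'log2_scale': -5}
```

With a log2 scale, dequantization is just a bit shift (here by 5), which is presumably why CodeGen wants the scale in log2 form rather than as an arbitrary float.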
It looks like some of these are only implemented for int32 -> int8 right now. Is that correct?
If so, is there some way I could annotate the graph with this information myself?