Quantization - Current state


Hi, I am trying to understand the current state of quantization.

  • Starting from an unquantized graph: I used this script to generate a quantized graph: https://gist.github.com/ZihengJiang/bcabe46a712a417a01a6967d4430b6b5. Does it support only symmetric quantization as of today? If I understand correctly, asymmetric quantization would transform the original FP32 conv into an int8 conv plus several additional operators.

  • Starting from an already quantized graph: can we read in a graph that a framework has already quantized and convert it to Relay? This also seems non-trivial: unlike the typical one-to-one or one-to-many transformations in the Relay parser, this task might require many-to-many transformations.
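
To make the first question concrete, here is a minimal NumPy sketch of why I expect asymmetric quantization to add extra operators. It is not TVM code: the helper names (`quantize_asymmetric`, `int_dot_direct`, `int_dot_expanded`) are made up for illustration, a 1-D dot product stands in for a conv, and I assume an unsigned 8-bit range with a per-tensor scale and zero point.

```python
import numpy as np

def quantize_asymmetric(x, num_bits=8):
    """Hypothetical helper: affine (asymmetric) quantization, x ~ scale * (q - zero_point)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    # Make sure the representable range includes 0.0 so that zero maps to an integer.
    lo = min(float(x.min()), 0.0)
    hi = max(float(x.max()), 0.0)
    scale = (hi - lo) / (qmax - qmin)  # assumes x is not all zeros
    zero_point = int(round(qmin - lo / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.int64)
    return q, scale, zero_point

def int_dot_direct(qa, za, qw, zw):
    """Integer dot product with the zero points subtracted before multiplying."""
    return int(np.sum((qa - za) * (qw - zw)))

def int_dot_expanded(qa, za, qw, zw):
    """Same value, written as one plain integer product plus three correction terms."""
    n = qa.size
    return int(np.sum(qa * qw) - zw * np.sum(qa) - za * np.sum(qw) + n * za * zw)

rng = np.random.default_rng(0)
a = rng.uniform(-1.0, 2.0, size=64)  # "activations"
w = rng.uniform(-1.0, 1.0, size=64)  # "weights"

qa, sa, za = quantize_asymmetric(a)
qw, sw, zw = quantize_asymmetric(w)

direct = int_dot_direct(qa, za, qw, zw)
expanded = int_dot_expanded(qa, za, qw, zw)

# Rescale the integer accumulator back to a float to compare with the FP32 result.
approx = sa * sw * expanded
print(direct == expanded, float(a @ w), approx)
```

In the symmetric case both zero points are 0, so the three correction terms vanish and only the plain integer product remains. That is why I would expect the asymmetric case to lower to an int8 conv plus extra reduction and arithmetic operators.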

@ziheng @eqy