[Quantization] How to quantize nn.bias_add?

adb · December 27, 2019, 12:25am

Is there a setting to enable quantization of nn.bias_add? I have an int8 bias that I want to broadcast-add to the int8 output of nn.conv2d and nn.dense.
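For reference, the arithmetic being asked about can be sketched in plain C++, independent of TVM. This is only an illustration under stated assumptions: the helper name `BiasAddInt8` is hypothetical, the data is assumed to be in NCHW order with one bias value per channel, and the bias and the conv2d/dense output are assumed to share the same quantization scale. The sum is widened to int32 and saturated back to int8, since int8 + int8 can overflow.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of TVM.
// Adds a per-channel int8 bias to an int8 tensor stored flat in NCHW order.
// Assumption: data and bias use the same quantization scale.
// `channels` is C and `spatial` is H * W; the batch size is implied by data.size().
std::vector<int8_t> BiasAddInt8(const std::vector<int8_t>& data,
                                const std::vector<int8_t>& bias,
                                std::size_t channels, std::size_t spatial) {
  std::vector<int8_t> out(data.size());
  for (std::size_t i = 0; i < data.size(); ++i) {
    // Channel index of flat element i in NCHW layout.
    const std::size_t channel = (i / spatial) % channels;
    // Widen to int32 so the sum cannot overflow, then saturate to the int8 range.
    const int32_t sum = static_cast<int32_t>(data[i]) +
                        static_cast<int32_t>(bias[channel]);
    const int32_t clipped = std::max<int32_t>(-128, std::min<int32_t>(127, sum));
    out[i] = static_cast<int8_t>(clipped);
  }
  return out;
}
```

For nn.dense output of shape (batch, units), the same helper applies with `channels = units` and `spatial = 1`. If the bias and the activations have different scales, one of them would have to be rescaled before the add; that step is left out of this sketch.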

adb · January 3, 2020, 6:20pm

More generally, are there any resources on how to implement quantization for Relay ops, as is done in the source?

masahi · January 4, 2020, 1:32am

Not sure, but I think you need a registration for bias_add like the one below for nn.dense.

RELAY_REGISTER_OP("nn.dense")
.set_attr<FForwardRewrite>("FQRealizeRewrite", DenseRealize);

Our quantization support is in a rough state at the moment (no docs, just code).