Is there a setting to enable quantization of nn.bias_add? I have an int8 bias that I want to broadcast-add to the int8 output of nn.conv2d and nn.dense.
More generally, are there any resources on how to implement quantization for Relay ops, as is done in the source?
I'm not sure, but I think you need a registration for bias_add like the one below for nn.dense:
RELAY_REGISTER_OP("nn.dense")
.set_attr<FForwardRewrite>("FQRealizeRewrite", DenseRealize);
Our quantization support is in a rough state at the moment (no docs, just code).
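For anyone reading along, here is a minimal plain-Python sketch of the arithmetic the question describes: broadcasting a per-channel int8 bias onto an int8 tensor. It does not use TVM and is not how the quantization pass realizes any op. The names `saturate_int8` and `bias_add_int8` are hypothetical, and it assumes the data and the bias share the same quantization scale and that out-of-range sums should saturate rather than wrap.

```python
INT8_MIN, INT8_MAX = -128, 127


def saturate_int8(value):
    """Clamp an integer to the int8 range (assumption: saturate, not wrap)."""
    return max(INT8_MIN, min(INT8_MAX, value))


def bias_add_int8(data, bias):
    """Add bias[c] to every element of channel c.

    data: nested list shaped [C][H][W] holding int8 values
    bias: list of C int8 values (assumed to share data's scale)
    """
    if len(data) != len(bias):
        raise ValueError("bias length must match the channel dimension")
    out = []
    for channel, b in zip(data, bias):
        # Python ints do not overflow, so the sum is exact before clamping.
        out.append([[saturate_int8(x + b) for x in row] for row in channel])
    return out


if __name__ == "__main__":
    data = [[[100, -100], [0, 1]], [[-128, 127], [5, -5]]]
    bias = [50, -10]
    print(bias_add_int8(data, bias))
```

If the two operands have different scales, one of them would have to be rescaled before the add, which this sketch leaves out.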