Hi,
I am confused about the quantization scheme.
def quantize(graph, params=None, dataset=None):
'''
dataset: list of dict of Var -> NDArray
The calibration dataset.
'''
As shown in the docstring, where is this `dataset` actually used?
Also, the calibrate function reads:
if kind == QAnnotateKind.WEIGHT:
var = expr.args[0]
assert isinstance(var, _expr.Constant)
scale = power2_scale(var.data)
else:
    scale = cfg.global_scale
valid_range = 2**valid_bit
const_params[ndom_scale] = _make_const(scale / valid_range)
…
and cfg.global_scale is 8.0 by default, which means the ndom_scale of every INPUT/ACTIVATION ends up with the same value (8.0 / 128 = 0.0625).
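To spell out the arithmetic behind that 0.0625 (this is my reading of the code, assuming 8-bit signed quantization so that valid_bit = 8 - 1 = 7; the variable names follow the snippet above):

```python
# Sketch of the non-WEIGHT branch of the scale computation quoted above.
# Assumption: nbit = 8, sign = 1, so valid_bit = 7 and valid_range = 128.
global_scale = 8.0            # cfg.global_scale default
nbit, sign = 8, 1
valid_bit = nbit - sign       # 7
valid_range = 2 ** valid_bit  # 128

ndom_scale = global_scale / valid_range
print(ndom_scale)             # 0.0625, identical for every INPUT/ACTIVATION
```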
How does this work? In other frameworks like TF, calibration datasets and an EMA algorithm are used to estimate appropriate scales, but TVM doesn't seem to work that way.
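For comparison, here is a rough sketch of the EMA-style range tracking I mean (an illustrative toy class I wrote, not TF's actual implementation or API): each calibration batch updates a running min/max via an exponential moving average, and the scale is derived from the tracked range instead of a fixed global constant.

```python
class EMARangeObserver:
    """Toy sketch of TF-style calibration: track activation min/max
    with an exponential moving average, then derive a per-tensor scale.
    (Hypothetical helper for illustration only.)"""

    def __init__(self, decay=0.99):
        self.decay = decay
        self.min_val = None
        self.max_val = None

    def observe(self, batch):
        # batch: a flat list of activation values from one calibration batch
        lo, hi = min(batch), max(batch)
        if self.min_val is None:
            self.min_val, self.max_val = lo, hi
        else:
            d = self.decay
            self.min_val = d * self.min_val + (1 - d) * lo
            self.max_val = d * self.max_val + (1 - d) * hi

    def scale(self, nbit=8):
        # Symmetric scale for signed nbit quantization:
        # largest tracked magnitude mapped onto 2**(nbit-1) levels.
        valid_range = 2 ** (nbit - 1)
        return max(abs(self.min_val), abs(self.max_val)) / valid_range
```

With this approach the scale adapts to the observed activation statistics, whereas the snippet above appears to assign one fixed value to every activation.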