MinMaxObserver

class torch.quantization.observer.MinMaxObserver(dtype=torch.quint8, qscheme=torch.per_tensor_affine, reduce_range=False, quant_min=None, quant_max=None, factory_kwargs=None, memoryless=False)

Observer module for computing the quantization parameters based on the running min and max values.
The module records the running minimum and maximum of incoming tensors and uses these statistics to compute the quantization parameters.
Parameters
- dtype – Quantized data type.
- qscheme – Quantization scheme to be used.
- reduce_range – Reduces the range of the quantized data type by 1 bit.
- quant_min – Minimum quantization value. If unspecified, it will follow the 8-bit setup.
- quant_max – Maximum quantization value. If unspecified, it will follow the 8-bit setup.
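Given running min/max as $x_\text{min}$ and $x_\text{max}$, scale $s$ and zero point $z$ are computed as described below.

The running minimum/maximum $x_\text{min/max}$ is computed as:

$$
\begin{aligned}
x_\text{min} &= \begin{cases} \min(X) & \text{if } x_\text{min} = \text{None} \\ \min\left(x_\text{min}, \min(X)\right) & \text{otherwise} \end{cases} \\
x_\text{max} &= \begin{cases} \max(X) & \text{if } x_\text{max} = \text{None} \\ \max\left(x_\text{max}, \max(X)\right) & \text{otherwise} \end{cases}
\end{aligned}
$$

where $X$ is the observed tensor.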
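The running update can be illustrated without PyTorch. The following is a minimal sketch, assuming the rule is "take the batch min/max on the first observation, otherwise combine it with the stored value"; the helper name `update_running_min_max` is hypothetical and is not part of the library.

```python
from typing import Iterable, Optional, Tuple


def update_running_min_max(
    x_min: Optional[float],
    x_max: Optional[float],
    batch: Iterable[float],
) -> Tuple[float, float]:
    """Hypothetical helper: fold one observed batch into the running min/max."""
    values = list(batch)
    batch_min, batch_max = min(values), max(values)
    # First observation: no stored statistic yet, so use the batch values directly.
    new_min = batch_min if x_min is None else min(x_min, batch_min)
    new_max = batch_max if x_max is None else max(x_max, batch_max)
    return new_min, new_max


# Feed three batches in sequence; the statistics only ever widen.
x_min, x_max = None, None
for batch in ([0.5, 2.0], [-1.0, 1.5], [0.0, 3.0]):
    x_min, x_max = update_running_min_max(x_min, x_max, batch)

print(x_min, x_max)
```

Because the range only widens, a single outlier batch affects every later quantization parameter, which is worth keeping in mind when choosing calibration data.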
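The scale $s$ and zero point $z$ are then computed as:

$$
\begin{aligned}
&\text{if symmetric:} \\
&\qquad s = 2 \max\left(|x_\text{min}|, x_\text{max}\right) / \left(Q_\text{max} - Q_\text{min}\right) \\
&\qquad z = \begin{cases} 0 & \text{if dtype is qint8} \\ 128 & \text{otherwise} \end{cases} \\
&\text{otherwise:} \\
&\qquad s = \left(x_\text{max} - x_\text{min}\right) / \left(Q_\text{max} - Q_\text{min}\right) \\
&\qquad z = Q_\text{min} - \text{round}\left(x_\text{min} / s\right)
\end{aligned}
$$

where $Q_\text{min}$ and $Q_\text{max}$ are the minimum and maximum of the quantized data type.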
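As a worked illustration of the scale and zero-point computation, here is a minimal pure-Python sketch. It assumes the symmetric case uses s = 2·max(|x_min|, x_max) / (Q_max − Q_min) with a fixed zero point (0 for a signed type, 128 otherwise), and the non-symmetric case uses s = (x_max − x_min) / (Q_max − Q_min) and z = Q_min − round(x_min / s). The function `compute_qparams` and its `signed` flag are hypothetical names for this example; the library's own implementation may apply further adjustments (for example clamping) that are not shown here.

```python
from typing import Tuple


def compute_qparams(
    x_min: float,
    x_max: float,
    q_min: int,
    q_max: int,
    symmetric: bool,
    signed: bool = True,
) -> Tuple[float, int]:
    """Hypothetical helper mirroring the documented scale/zero-point formulas."""
    # Degenerate range: fall back to scale 1.0 and zero point 0.
    if x_min == x_max:
        return 1.0, 0
    if symmetric:
        scale = 2 * max(abs(x_min), x_max) / (q_max - q_min)
        # Signed type (qint8-like) centres on 0, unsigned on 128.
        zero_point = 0 if signed else 128
    else:
        scale = (x_max - x_min) / (q_max - q_min)
        zero_point = q_min - round(x_min / scale)
    return scale, zero_point


# Non-symmetric case on an unsigned 8-bit range [0, 255].
affine_scale, affine_zp = compute_qparams(-1.0, 3.0, 0, 255, symmetric=False)

# Symmetric case on a signed 8-bit range [-128, 127].
sym_scale, sym_zp = compute_qparams(-1.0, 3.0, -128, 127, symmetric=True)

print(affine_scale, affine_zp)
print(sym_scale, sym_zp)
```

In the symmetric case the scale depends only on the larger magnitude of the two extremes, so an asymmetric input range such as [-1, 3] leaves part of the quantized range unused.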
Warning
Only works with the torch.per_tensor_symmetric quantization scheme.

Warning
dtype can only take torch.qint8 or torch.quint8.

Note
If the running minimum equals the running maximum, the scale and zero_point are set to 1.0 and 0.
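Example
A minimal usage sketch, assuming a PyTorch build in which `torch.quantization.observer.MinMaxObserver` is importable. The input values are arbitrary illustration data. It passes two tensors through the observer and then reads back the recorded statistics and the derived quantization parameters.

```python
import torch
from torch.quantization.observer import MinMaxObserver

# Symmetric per-tensor observer for a signed 8-bit quantized type.
obs = MinMaxObserver(dtype=torch.qint8, qscheme=torch.per_tensor_symmetric)

# Each call updates the running min/max; the input is passed through unchanged.
obs(torch.tensor([-1.0, 0.5]))
obs(torch.tensor([0.0, 3.0]))

# Derive the quantization parameters from the recorded statistics.
scale, zero_point = obs.calculate_qparams()

print("running min/max:", obs.min_val.item(), obs.max_val.item())
print("scale:", scale.item(), "zero_point:", zero_point.item())
```

In a typical calibration flow the observer is attached to a module and fed representative data before `calculate_qparams()` is called.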