<p>Post-training quantization (PTQ) can reduce the memory footprint and latency of deep model inference while still preserving model accuracy, using only a small unlabeled calibration set and without retraining on the full training set. To calibrate a quantized model, current PTQ methods usually select unlabeled data at random from the training set as calibration data. However, we show that random data selection results in performance instability and degradation due to activation distribution mismatch. In this paper, we address the crucial task of optimal calibration data selection and propose a novel one-shot calibration data selection method, termed SelectQ, which selects specific data for calibration via dynamic clustering. SelectQ uses activation statistics and performs layer-wise clustering to learn the activation distribution over the training set. To this end, a new metric called Knowledge Distance is proposed to measure the distance of each sample's activation statistics from the cluster centroids. After calibration with the selected data, quantization noise is alleviated because the distribution mismatch within activations is mitigated. Extensive experiments on the ImageNet dataset show that SelectQ increases the Top-1 accuracy of ResNet18 by over 15% under 4-bit quantization, compared to a randomly sampled calibration set. Notably, SelectQ requires neither backward propagation nor Batch Normalization parameters, so it has fewer limitations in practical applications. </p>
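<p>The selection procedure described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: it clusters per-sample activation statistics (e.g. channel-wise means from one layer) with plain k-means and picks the samples nearest each centroid. Euclidean distance stands in as a placeholder for the paper's Knowledge Distance metric, and all function and parameter names (<code>select_calibration_data</code>, <code>per_cluster</code>) are hypothetical.</p>

```python
import numpy as np

def select_calibration_data(activations, num_clusters, per_cluster, iters=20, seed=0):
    """Illustrative sketch of clustering-based calibration data selection.

    activations : (N, D) array of per-sample activation statistics
                  (e.g. channel-wise means collected from one layer).
    Returns the indices of the samples nearest each cluster centroid.
    Plain Euclidean distance is used as a stand-in for the paper's
    Knowledge Distance; names here are hypothetical, not SelectQ's API.
    """
    rng = np.random.default_rng(seed)
    # Initialize centroids from randomly chosen samples.
    centroids = activations[rng.choice(len(activations), num_clusters, replace=False)].copy()

    # Lloyd's k-means iterations over the activation statistics.
    for _ in range(iters):
        dists = np.linalg.norm(activations[:, None, :] - centroids[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        for k in range(num_clusters):
            members = activations[labels == k]
            if len(members):  # keep old centroid if cluster is empty
                centroids[k] = members.mean(axis=0)

    # For each centroid, select the closest samples as calibration data.
    dists = np.linalg.norm(activations[:, None, :] - centroids[None, :, :], axis=-1)
    selected = []
    for k in range(num_clusters):
        selected.extend(np.argsort(dists[:, k])[:per_cluster].tolist())
    return sorted(set(selected))
```

<p>In practice the statistics would be collected with a single forward pass over the training set, and the selected subset then serves as the calibration data for the PTQ step; no backward pass is needed, consistent with the abstract's claim.</p>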