<p>Post-training quantization (PTQ) can reduce the memory footprint and latency of deep model inference while still preserving model accuracy, using only a small unlabeled calibration set and without retraining on the full training set. To calibrate a quantized model, current PTQ methods usually select unlabeled data at random from the training set as calibration data. However, we show that random data selection results in performance instability and degradation due to activation distribution mismatch. In this paper, we address the crucial task of optimal calibration data selection and propose a novel one-shot calibration data selection method, termed SelectQ, which selects specific data for calibration via dynamic clustering. SelectQ uses activation statistics and performs layer-wise clustering to learn the activation distribution over the training set. To this end, a new metric called Knowledge Distance is proposed to measure the distance of each sample's activation statistics from the cluster centroids. After calibration with the selected data, quantization noise is alleviated because the distribution mismatch within activations is mitigated. Extensive experiments on the ImageNet dataset show that SelectQ increases the Top-1 accuracy of ResNet18 by over 15% under 4-bit quantization, compared to a randomly sampled calibration set. Notably, SelectQ requires neither backward propagation nor Batch Normalization parameters, so it has fewer limitations in practical applications. </p>
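<p>The selection procedure described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: it clusters per-sample activation statistics (e.g. channel-wise means from one layer) with plain k-means and picks the samples nearest each centroid. Euclidean distance stands in as a placeholder for the paper's Knowledge Distance metric, and all function and parameter names (<code>select_calibration_data</code>, <code>per_cluster</code>) are hypothetical.</p>

```python
import numpy as np

def select_calibration_data(activations, num_clusters, per_cluster, iters=20, seed=0):
    """Illustrative sketch of clustering-based calibration data selection.

    activations : (N, D) array of per-sample activation statistics
                  (e.g. channel-wise means collected from one layer).
    Returns the indices of the samples nearest each cluster centroid.
    Plain Euclidean distance is used as a stand-in for the paper's
    Knowledge Distance; names here are hypothetical, not SelectQ's API.
    """
    rng = np.random.default_rng(seed)
    # Initialize centroids from randomly chosen samples.
    centroids = activations[rng.choice(len(activations), num_clusters, replace=False)].copy()

    # Lloyd's k-means iterations over the activation statistics.
    for _ in range(iters):
        dists = np.linalg.norm(activations[:, None, :] - centroids[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        for k in range(num_clusters):
            members = activations[labels == k]
            if len(members):  # keep old centroid if cluster is empty
                centroids[k] = members.mean(axis=0)

    # For each centroid, select the closest samples as calibration data.
    dists = np.linalg.norm(activations[:, None, :] - centroids[None, :, :], axis=-1)
    selected = []
    for k in range(num_clusters):
        selected.extend(np.argsort(dists[:, k])[:per_cluster].tolist())
    return sorted(set(selected))
```

<p>In practice the statistics would be collected with a single forward pass over the training set, and the selected subset then serves as the calibration data for the PTQ step; no backward pass is needed, consistent with the abstract's claim.</p>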