“…Note that nwma means n-bit quantization for weights and m-bit quantization for activations. As baselines, we selected ZeroQ [3], ZAQ [7], and GDFQ [5] as the important previous works on generative data-free quantization. We report top-1 accuracy for each experiment.…”
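To make the nwma notation concrete, a minimal symmetric uniform quantizer can be sketched as follows. This is a toy NumPy illustration of what "n-bit quantization" means; the function name and per-tensor scaling are our own assumptions, not the calibration scheme used by any of the cited baselines:

```python
import numpy as np

def uniform_quantize(x, n_bits):
    """Symmetric uniform quantization of a tensor to n_bits.

    A minimal illustration only: real data-free quantization schemes
    calibrate the scale from synthetic samples rather than from x itself.
    """
    qmax = 2 ** (n_bits - 1) - 1          # e.g. 7 for signed 4-bit
    scale = np.max(np.abs(x)) / qmax      # per-tensor scale factor
    q = np.clip(np.round(x / scale), -qmax, qmax)
    return q * scale                      # dequantized values

# "4w4a" would apply this with n_bits=4 to both weights and activations
w = np.array([0.9, -0.45, 0.1, -0.02])
w4 = uniform_quantize(w, 4)
```

A 4-bit quantizer of this form can represent at most 15 distinct signed levels, which is why accuracy drops sharply as n shrinks.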
Section: Results
“…DSG [4] and IntraQ [6] focus on increasing the diversity of data impressions to expand the coverage space of synthetic samples. ZAQ [7] mainly works by exploring and transferring information of individual samples and their correlations, mitigating the gap between the full-precision and quantized models. Qimera [8] employs superposed latent embeddings to create boundary supporting samples, which is ignored in conventional methods.…”
Section: Related Work
“…From an adversarial learning perspective, G aims to maximize the model discrepancy, while Q is supposed to minimize it. To achieve this, motivated by [7], we explore the input gradient of alternative data between the student and teacher to characterize how small changes at each input pixel affect the model output. We propose an adversarial exploration method with input gradient as the medium of transfer.…”
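The input-gradient idea in the snippet above can be sketched with toy linear models, where the gradient of the teacher/student discrepancy with respect to the input has a closed form. All names below are ours, not from the cited work; a real implementation would obtain this gradient by backpropagation through the full networks:

```python
import numpy as np

def discrepancy_input_grad(x, W_t, W_s):
    """Gradient of the teacher/student output gap w.r.t. the input.

    For toy linear models t(x) = W_t @ x and s(x) = W_s @ x with
    discrepancy d(x) = 0.5 * ||t(x) - s(x)||^2, the input gradient is
    (W_t - W_s)^T (W_t - W_s) x: how sensitive the model gap is to each
    input "pixel" -- the quantity the adversarial exploration feeds back.
    """
    D = W_t - W_s
    return D.T @ (D @ x)

rng = np.random.default_rng(0)
W_t = rng.standard_normal((3, 5))                 # "teacher"
W_s = W_t + 0.1 * rng.standard_normal((3, 5))    # stand-in for quantized model
x = rng.standard_normal(5)
g = discrepancy_input_grad(x, W_t, W_s)
d = lambda v: 0.5 * np.sum((W_t @ v - W_s @ v) ** 2)
assert d(x + 0.5 * g) > d(x)   # an ascent step on x widens the gap (G's goal)
```

The generator plays ascent on this quantity while the quantized model plays descent, which is the minimax game the snippet describes.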
Data-free quantization has recently emerged as a promising approach to performing quantization without access to the original data. However, such approaches suffer from homogenization of the synthetic data, caused by the low efficiency of diverse data generation and by performance collapse of the generator. To alleviate these issues, we propose a novel Meta-BNS scheme for adversarial data-free quantization, which consists of a Meta-BNS module and an adversarial exploration module. The Meta-BNS module automatically learns an enhancement-coefficient matrix function for the BN loss, providing a suitable constraint on the generator. The adversarial exploration module leverages a minimax game between the generator and the quantized model, mediated by the input gradient, to encourage the generator to learn the high-dimensional and complex distribution of real data. Experimental results show that our method achieves state-of-the-art performance in various data-free quantization settings.
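The BN-statistics constraint that Meta-BNS builds on can be illustrated in miniature. The sketch below shows only the generic weighted BNS-matching loss used throughout generative data-free quantization; the enhancement-coefficient matrix function that Meta-BNS learns is replaced here by a fixed per-layer array, and all names are ours:

```python
import numpy as np

def bns_loss(feats, bn_means, bn_vars, coeffs):
    """Weighted batch-norm statistics (BNS) matching loss.

    The generator is constrained so that the batch statistics of its
    synthetic images, measured at each BN layer of the full-precision
    model, match the running statistics stored in that layer.  `coeffs`
    stands in for the per-layer weighting (in Meta-BNS this weighting is
    itself learned; here it is a fixed array for illustration only).
    """
    loss = 0.0
    for f, mu, var, c in zip(feats, bn_means, bn_vars, coeffs):
        batch_mu = f.mean(axis=0)          # per-channel batch mean
        batch_var = f.var(axis=0)          # per-channel batch variance
        loss += c * (np.sum((batch_mu - mu) ** 2)
                     + np.sum((batch_var - var) ** 2))
    return loss
```

When the synthetic batch reproduces the stored statistics exactly, the loss is zero; any mismatch is penalized in proportion to the layer's coefficient.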
“…[62] use weight-only quantization for medical image segmentation as an attempt to remove noise, not for computational efficiency. The recent work of [39] presents a sophisticated post-training quantization scheme and includes fine-tuned semantic segmentation results. Again, a significant degradation is observed when going from 6 to 4 bits.…”
Convolutional Neural Networks (CNNs) are known for requiring extensive computational resources, and quantization is among the most common and effective methods for compressing them. While aggressive quantization (i.e., below 4 bits) performs well for classification, it can cause severe performance degradation in image-to-image tasks such as semantic segmentation and depth estimation. In this paper, we propose Wavelet Compressed Convolution (WCC), a novel approach for compressing high-resolution activation maps, integrated with the point-wise convolutions that dominate the computational cost of modern architectures. To this end, we use the efficient and hardware-friendly Haar wavelet transform, known for its effectiveness in image compression, and define the convolution on the compressed activation map. We experiment on various tasks that benefit from high-resolution input, and by combining WCC with light quantization we achieve compression rates equivalent to 1-4 bit activation quantization with relatively small and much more graceful degradation in performance. *Contributed equally. Preprint; under review.
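The hardware-friendliness claimed for the Haar transform comes from it needing only additions, subtractions, and shifts. A minimal single-level 2-D Haar decomposition of one activation channel can be sketched as follows (an illustration of the transform itself, not of the WCC convolution scheme; the averaging normalization is our choice):

```python
import numpy as np

def haar2d(x):
    """One level of the 2-D Haar wavelet transform of a 2-D map.

    Each 2x2 block is summarized by its average (LL) plus three detail
    bands.  With this normalization the block entry is recovered exactly
    as ll + lh + hl + hh, so the transform is lossless.
    """
    a = x[0::2, 0::2]; b = x[0::2, 1::2]   # top-left, top-right of each block
    c = x[1::2, 0::2]; d = x[1::2, 1::2]   # bottom-left, bottom-right
    ll = (a + b + c + d) / 4.0   # low-pass: 2x downsampled average
    lh = (a + b - c - d) / 4.0   # detail: top rows vs. bottom rows
    hl = (a - b + c - d) / 4.0   # detail: left columns vs. right columns
    hh = (a - b - c + d) / 4.0   # diagonal detail
    return ll, lh, hl, hh
```

On smooth activation maps the three detail bands are near zero, which is why the transformed representation tolerates aggressive quantization more gracefully than the raw map.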
“…Some researchers obtained synthetic samples that resemble the distribution of the authentic samples by using information from the pre-trained full-precision network, such as batch-normalization (BN) statistics. These approaches can be categorized into noise-optimized data-free quantization [8, 12-14] and generative data-free quantization [9, 15-17]. The former initializes a sample that follows a Gaussian distribution, with dimensions consistent with the size of a real sample.…”
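The noise-optimized branch described above can be sketched in miniature: a Gaussian-initialized sample is optimized directly so that its statistics match target (BN-style) statistics. Real methods of this family (e.g., ZeroQ) match statistics at every BN layer of the network; this toy version, with names and hyperparameters of our own choosing, matches mean and variance on the input itself:

```python
import numpy as np

def optimize_noise(target_mu, target_var, steps=200, lr=0.05, seed=0):
    """Noise-optimized data-free sample generation, in miniature.

    Start from a Gaussian "image" the same size as a real sample and
    run gradient descent on its pixels so the sample mean and variance
    approach the target statistics.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(256)          # Gaussian init, real-sample size
    for _ in range(steps):
        mu, var = x.mean(), x.var()
        # gradient of n * [(mu - t_mu)^2 + (var - t_var)^2] w.r.t. each pixel
        grad = 2.0 * (mu - target_mu) + 4.0 * (var - target_var) * (x - mu)
        x = x - lr * grad
    return x
```

After a couple hundred steps the sample's statistics sit on the targets, which is the data-free analogue of "looking like" the training distribution as seen through the BN layers.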