Abstract: Model quantization is a promising approach to compress deep neural networks and accelerate inference, making it possible to be deployed on mobile and edge devices. To retain the high performance of full-precision models, most existing quantization methods focus on fine-tuning quantized model by assuming training datasets are accessible. However, this assumption sometimes is not satisfied in real situations due to data privacy and security issues, thereby making these quantization methods not applicable. To ach…
“…Note that nwma means n-bit quantization for weights and m-bit quantization for activations. As baselines, we selected ZeroQ [15], ZAQ [14], and GDFQ [13], which are important previous works on generative data-free quantization. In addition, we implemented Mixup [50] and Cutmix [51], data augmentation schemes that mix input images, on top of GDFQ.…”
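To make the nwma notation concrete, the following is a minimal PyTorch sketch of a symmetric per-tensor uniform (fake) quantizer. The exact quantization scheme used by the compared methods may differ; the function below is purely illustrative.

```python
import torch

def uniform_quantize(x: torch.Tensor, num_bits: int) -> torch.Tensor:
    """Symmetric per-tensor uniform (fake) quantization to num_bits bits (illustrative)."""
    qmax = 2 ** (num_bits - 1) - 1                   # e.g. 7 for 4-bit signed values
    scale = x.abs().max().clamp(min=1e-8) / qmax     # per-tensor scale factor
    return torch.round(x / scale).clamp(-qmax - 1, qmax) * scale

# "4w4a" then means weights are quantized with num_bits=4 and activations with num_bits=4.
w = torch.randn(64, 128)
a = torch.relu(torch.randn(32, 128))
w_q, a_q = uniform_quantize(w, 4), uniform_quantize(a, 4)
```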
“…where the first term $L_{CE}$ guides the generator to output clearly classifiable samples, and the second term $L_{BNS}$ aligns the batch-normalization statistics of the synthetic samples with those of the batch-normalization layers in the full-precision model. In another previous work, ZAQ [14],…”
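As a rough illustration of how such a generator objective can be assembled, here is a PyTorch-style sketch that combines a cross-entropy term on the full-precision model's predictions with a batch-normalization statistics alignment term collected via forward hooks. The hook mechanism and the bns_weight coefficient are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def generator_loss(fp_model: nn.Module, fake_images: torch.Tensor,
                   target_labels: torch.Tensor, bns_weight: float = 0.1) -> torch.Tensor:
    """L = L_CE + bns_weight * L_BNS for a batch of synthetic images (illustrative)."""
    bns_terms, hooks = [], []

    def make_hook(bn: nn.BatchNorm2d):
        def hook(module, inputs, output):
            x = inputs[0]
            batch_mean = x.mean(dim=(0, 2, 3))
            batch_var = x.var(dim=(0, 2, 3), unbiased=False)
            # L_BNS: pull the batch statistics of the synthetic samples toward the
            # running statistics stored in the full-precision model's BN layers.
            bns_terms.append(F.mse_loss(batch_mean, bn.running_mean) +
                             F.mse_loss(batch_var, bn.running_var))
        return hook

    for m in fp_model.modules():
        if isinstance(m, nn.BatchNorm2d):
            hooks.append(m.register_forward_hook(make_hook(m)))

    logits = fp_model(fake_images)          # forward pass fills bns_terms via the hooks
    for h in hooks:
        h.remove()

    loss_ce = F.cross_entropy(logits, target_labels)   # samples should be clearly classifiable
    loss_bns = torch.stack(bns_terms).sum()            # BN-statistics alignment
    return loss_ce + bns_weight * loss_bns
```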
“…In addition, DSG [28] further suggests relaxing the batch-normalization statistics alignment to generate more diverse samples. ZAQ [14] adopted adversarial training of the generator on the quantization problem and introduced intermediate feature matching between the full-precision and quantized models. However, none of these works aimed to synthesize boundary supporting samples of the full-precision model.…”
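For reference, intermediate feature matching of the kind described for ZAQ can be sketched as a simple distance between corresponding activations of the two models; the choice of L1 distance below is an assumption made for illustration, not ZAQ's exact formulation.

```python
import torch
import torch.nn.functional as F

def feature_matching_loss(fp_feats: list, q_feats: list) -> torch.Tensor:
    """Distance between intermediate activations of the full-precision and quantized models.
    The quantized model minimizes this term; in ZAQ-style adversarial training the generator
    instead tries to maximize the discrepancy to expose samples on which the models disagree."""
    return sum(F.l1_loss(q, p.detach()) for p, q in zip(fp_feats, q_feats))
```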
Section: Data-free Compression
“…Recent generative data-free quantization schemes [13,14] employ a GAN-like generator to create synthetic samples. In the absence of the original training samples, the generator G attempts to generate synthetic samples so that the quantized model Q can mimic the behavior of the full-precision model P.…”
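A minimal sketch of the distillation half of this setup is shown below, assuming a conditional generator and a KL-divergence mimicry loss; the generator's own adversarial update and any additional loss terms are omitted, and the function signature is an assumption for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def distill_step(generator: nn.Module, fp_model: nn.Module, q_model: nn.Module,
                 opt_q: torch.optim.Optimizer, batch_size: int = 64,
                 latent_dim: int = 100, num_classes: int = 10) -> float:
    """One update of the quantized model Q against the full-precision model P,
    using only synthetic samples from the generator G (no real data)."""
    z = torch.randn(batch_size, latent_dim)            # random noise input to G
    y = torch.randint(0, num_classes, (batch_size,))   # target labels for a conditional G
    fake_images = generator(z, y).detach()             # G is not updated in this step

    with torch.no_grad():
        teacher_logits = fp_model(fake_images)         # P acts as the fixed teacher
    student_logits = q_model(fake_images)              # Q is the quantized student

    # Q mimics P's behavior: KL divergence between the two output distributions.
    loss = F.kl_div(F.log_softmax(student_logits, dim=1),
                    F.softmax(teacher_logits, dim=1),
                    reduction="batchmean")
    opt_q.zero_grad()
    loss.backward()
    opt_q.step()
    return loss.item()
```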
“…Therefore, data-free quantization is a natural direction to achieve a highly accurate quantized model without accessing any training data. Among many excellent prior studies [9,10,11,12], generative methods [13,14,15] have recently been drawing much attention due to their superior performance. Generative methods successfully generate synthetic samples that resemble the distribution of the original dataset and achieve high accuracy using information from the pretrained full-precision network, such as batch-normalization statistics [15,13] or intermediate features [14].…”
Model quantization is known as a promising method to compress deep neural networks, especially for inferences on lightweight mobile or edge devices. However, model quantization usually requires access to the original training data to maintain the accuracy of the full-precision models, which is often infeasible in real-world scenarios for security and privacy issues. A popular approach to perform quantization without access to the original data is to use synthetically generated samples, based on batch-normalization statistics or adversarial learning. However, the drawback of such approaches is that they primarily rely on random noise input to the generator to attain diversity of the synthetic samples. We find that this is often insufficient to capture the distribution of the original data, especially around the decision boundaries. To this end, we propose Qimera, a method that uses superposed latent embeddings to generate synthetic boundary supporting samples. For the superposed embeddings to better reflect the original distribution, we also propose using an additional disentanglement mapping layer and extracting information from the full-precision model. The experimental results show that Qimera achieves state-of-the-art performances for various settings on data-free quantization. Code is available at https://github.com/iamkanghyunchoi/qimera.
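The superposition idea from the abstract can be sketched as feeding the generator a convex combination of two class embeddings instead of a single one. The module below is only an illustration with assumed names and dimensions; it omits the disentanglement mapping layer and the information extracted from the full-precision model (see the linked repository for the authors' implementation).

```python
import torch
import torch.nn as nn

class SuperposedEmbedding(nn.Module):
    """Illustrative sketch: a convex combination of two class embeddings is fed to the
    generator instead of a single class embedding, pushing the synthesized sample toward
    the decision boundary between the two classes."""
    def __init__(self, num_classes: int = 10, embed_dim: int = 100):
        super().__init__()
        self.embedding = nn.Embedding(num_classes, embed_dim)

    def forward(self, y_a: torch.Tensor, y_b: torch.Tensor, lam: torch.Tensor) -> torch.Tensor:
        e_a, e_b = self.embedding(y_a), self.embedding(y_b)
        # lam in [0, 1]; values near 0.5 superpose the two classes most strongly.
        lam = lam.unsqueeze(1)
        return lam * e_a + (1.0 - lam) * e_b

# Usage: the superposed embedding replaces (or augments) the usual noise input of the generator.
embed = SuperposedEmbedding()
y_a = torch.randint(0, 10, (32,))
y_b = torch.randint(0, 10, (32,))
lam = torch.rand(32)
z = embed(y_a, y_b, lam)   # shape (32, 100), consumed by the generator G
```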