2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.00448

Learning to Quantize Deep Networks by Optimizing Quantization Intervals With Task Loss

Abstract: Reducing the bit-widths of activations and weights of deep networks makes them efficient to compute and store in memory, which is crucial for their deployment to resource-limited devices such as mobile phones. However, decreasing bit-widths through quantization generally yields drastically degraded accuracy. To tackle this problem, we propose to learn to quantize activations and weights via a trainable quantizer that transforms and discretizes them. Specifically, we parameterize the quantization intervals and obta…
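The abstract describes a trainable quantizer whose quantization intervals are parameterized and optimized directly with the task loss. Below is a minimal PyTorch sketch of that idea, assuming a simplified clamp-and-round quantizer with learnable interval bounds (`lower`, `upper`) and a straight-through estimator for the rounding; the parameter names and the uniform transformer are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class TrainableQuantizer(nn.Module):
    """Uniform quantizer with a learnable interval [lower, upper] (illustrative sketch)."""

    def __init__(self, n_bits: int = 4, init_lower: float = 0.0, init_upper: float = 1.0):
        super().__init__()
        # Interval end-points are ordinary parameters, so they receive
        # gradients from the downstream task loss during backpropagation.
        self.lower = nn.Parameter(torch.tensor(init_lower))
        self.upper = nn.Parameter(torch.tensor(init_upper))
        self.levels = 2 ** n_bits - 1

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Clip to the current interval (differentiable w.r.t. lower/upper).
        x_clipped = torch.minimum(torch.maximum(x, self.lower), self.upper)
        # Normalize to [0, 1] relative to the interval.
        x_norm = (x_clipped - self.lower) / (self.upper - self.lower + 1e-8)
        # Round to one of 2^n_bits levels; the straight-through estimator
        # passes gradients through the rounding as identity.
        x_q = torch.round(x_norm * self.levels) / self.levels
        x_ste = x_norm + (x_q - x_norm).detach()
        # Map back to the interval's range.
        return x_ste * (self.upper - self.lower) + self.lower

# Usage: the interval parameters are trained jointly with the network,
# here with a stand-in loss in place of a real task loss.
quantizer = TrainableQuantizer(n_bits=4)
activations = torch.randn(8, 16, requires_grad=True)
loss = quantizer(activations).pow(2).mean()
loss.backward()
print(quantizer.lower.grad, quantizer.upper.grad)
```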

Cited by 304 publications (232 citation statements)
References 27 publications
“…But lower-bit quantization faces more challenges on accuracy. To address this problem, [7,20] try to optimize the clipping value or quantization interval for the task-specific loss in an end-to-end manner. [36,46,39] apply different techniques to find optimal bit-widths for each layer.…”
Section: Network Quantization
confidence: 99%
“…[36,46,39] apply different techniques to find optimal bit-widths for each layer. [42] and [20] optimize the training process with incremental and progressive quantization. [29] and [16] adjust the network structure to adapt to quantization.…”
Section: Network Quantization
confidence: 99%
“…As shown in the figure, when the stride changes from 1 to 2, the number of redundant cells is fixed to 1, while the distance between two adjacent PEs (dPEs) is doubled. Although a large stride or an odd input-feature bit-width can degrade the effectiveness of the proposed DWM-based cell array, since the stride is generally less than 3 [21] and 4-bit input and weight configurations can deliver full-precision accuracy [28], [29] for recent CNNs, the proposed systolic DWM-based array is effectively reconfigurable for different layer configurations.…”
Section: Generalization of DWM Input and Weight Bus
confidence: 99%