2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
DOI: 10.1109/cvprw50498.2020.00356
LSQ+: Improving low-bit quantization through learnable offsets and better initialization

Cited by 119 publications (115 citation statements)
References 9 publications
“…In recent years, researchers have been able to push the limits of quantization further and further. In particular, there is a growing corpus of work showing that extreme quantization of the weights down to 4-bit is possible while minimally affecting the prediction accuracy of the network [8]-[14]. 4-bit quantization directly offers 8× compression gains and similar improvements in computational efficiency.…”
Section: A. Techniques for Reducing the Information Content of the DNN's Parameters
Citation type: mentioning (confidence: 99%)
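The 8× figure follows directly from replacing 32-bit floating-point weights with 4-bit codes. A minimal sketch of that arithmetic in Python (the 50M-parameter model size is a hypothetical example, not taken from the paper):

# Compression ratio from quantizing 32-bit float weights to b-bit codes.
# The 50M-parameter count is hypothetical, chosen only for illustration.
num_params = 50_000_000
fp32_bits = 32

for b in (8, 4, 2):
    ratio = fp32_bits / b
    size_mb = num_params * b / 8 / 1e6
    print(f"{b}-bit: {ratio:.0f}x compression, ~{size_mb:.1f} MB of weights")
# e.g. 4-bit: 8x compression, ~25.0 MB of weights (vs. 200 MB at full precision)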
“…Although non-uniform quantization provides better performance over a wide range of input signal variances [6,7], and advanced dual-mode asymptotic solutions have been developed [8,9], simple uniform quantization [6,7,10-12] is the first choice when simplicity of the system is one of the major goals. Thus, uniform quantization has been widely applied for quantizing the parameters of neural networks (i.e., for neural network compression) [13-18], and different solutions have been considered, e.g., using 8 bits [13], 4 bits [14], or 2 bits [15-18]; non-uniform quantization has also been used [19-21]. It has been found that quantizing network parameters with 8 bits [13] or 16 bits [19] results in only slightly lower performance compared to the full-precision case, mainly because such quantizers can still reconstruct the data with high quality.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
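A minimal sketch of the asymmetric uniform quantization these works build on, written with a scale and offset in the spirit of LSQ+; the min/max calibration, random tensor values, and bit-width are illustrative assumptions (LSQ+ learns the scale and offset by gradient descent rather than computing them from the data):

import numpy as np

def uniform_quantize(w, num_bits=4):
    # Asymmetric uniform quantization: map values to integer levels in
    # [0, 2^b - 1] using a scale and offset derived from the tensor range.
    qmax = 2 ** num_bits - 1
    scale = (w.max() - w.min()) / qmax
    offset = w.min()
    q = np.clip(np.round((w - offset) / scale), 0, qmax)  # integer codes
    return q * scale + offset                             # dequantized values

# Illustrative usage on random data (not real network weights)
w = np.random.randn(1000).astype(np.float32)
w_hat = uniform_quantize(w, num_bits=4)
print("max abs error:", np.abs(w - w_hat).max())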
“…It has been found that quantizing network parameters with 8 bits [13] or 16 bits [19] results in only slightly lower performance compared to the full-precision case, mainly because such quantizers can still reconstruct the data with high quality. Further, when quantizers with smaller resolution are applied, e.g., 4 bits [14] or 2 bits [15-18,20,21], performance degradation has been observed; however, the achieved results remain comparable while offering a significantly higher level of compression. Eventually, significant attention was paid to the development of binary quantizer models for compressing neural networks [22-26], whose attractiveness lies in the amount of compression that can be achieved while aiming to preserve competitive performance.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
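For the binary case mentioned above, one common formulation keeps only the sign of each weight plus a single per-tensor scaling factor (the XNOR-Net-style scale is used here purely as an illustration of the general idea, not necessarily the exact scheme of [22]-[26]):

import numpy as np

def binarize(w):
    # Binary quantization: sign of each weight times one per-tensor scale.
    # alpha = mean(|w|) minimizes the L2 error of the sign approximation,
    # so storage drops to 1 bit per weight plus one float for alpha.
    alpha = np.abs(w).mean()
    return alpha * np.sign(w)

# Illustrative usage on random data (not real network weights)
w = np.random.randn(4, 4).astype(np.float32)
print(binarize(w))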