2021
DOI: 10.1016/j.patcog.2020.107647
Mixed-precision quantized neural networks with progressively decreasing bitwidth

Cited by 25 publications (11 citation statements); references 4 publications.
“…Chu et al. [87] heuristically assign the word length of the activations and weights of each layer based on the separability of their hierarchical distributions. However, for large datasets such as ImageNet, it is unaffordable to obtain a complete separability matrix.…”
Section: Mixed-precision Quantization (mentioning)
confidence: 99%
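
The heuristic described in the statement above (assigning per-layer word lengths from the class separability of feature distributions) can be illustrated with a small sketch: score each layer's features with a Fisher-style ratio of between-class to within-class scatter, and give fewer bits to layers whose features are more separable. The scoring function, the candidate bitwidths, and the rank-based mapping below are illustrative assumptions, not the exact procedure of Chu et al.

```python
import numpy as np

def class_separability(features, labels):
    """Fisher-style separability score for one layer: between-class
    scatter divided by within-class scatter of the (flattened) features.
    An illustrative proxy, not the exact criterion of the cited paper."""
    classes = np.unique(labels)
    overall_mean = features.mean(axis=0)
    between, within = 0.0, 0.0
    for c in classes:
        fc = features[labels == c]
        mc = fc.mean(axis=0)
        between += len(fc) * np.sum((mc - overall_mean) ** 2)
        within += np.sum((fc - mc) ** 2)
    return between / (within + 1e-12)

def assign_bitwidths(layer_scores, candidate_bits=(8, 6, 4, 2)):
    """Heuristically map separability scores to bitwidths:
    more separable layers are assigned fewer bits."""
    layer_scores = np.asarray(layer_scores)
    ranks = np.argsort(np.argsort(layer_scores))  # 0 = least separable layer
    bins = np.linspace(0, len(layer_scores), len(candidate_bits) + 1)
    bits = []
    for r in ranks:
        idx = min(np.searchsorted(bins, r, side="right") - 1,
                  len(candidate_bits) - 1)
        bits.append(candidate_bits[idx])
    return bits
```

In practice the scores would be computed from features extracted on a held-out set, which is exactly where the quoted concern about large datasets such as ImageNet arises.
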
“…Another notable drawback of the techniques discussed in [58], [72], [77], [83]-[87] is that they usually perform training repeatedly, which is highly inefficient and takes a long time to construct the quantized model [39]. Furthermore, training requires a full-size dataset, which is often unavailable in real-world scenarios for reasons such as proprietary restrictions and privacy, especially when working with an off-the-shelf pre-trained model from the community or industry for which the data is no longer accessible.…”
Section: Mixed-precision Quantization (mentioning)
confidence: 99%
“…Quantization [8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26], as the name implies, represents the weights and activations of the forward pass of a neural network, as well as the 32-bit or 64-bit floating-point gradient values of the backward pass, with low-bit floating-point or fixed-point numbers, which can even be used directly in computation. Figure 3 shows the basic idea of converting floating-point numbers into signed 8-bit fixed-point numbers.…”
Section: Model Quantization (mentioning)
confidence: 99%
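
The float-to-fixed-point conversion mentioned in connection with Figure 3 (the figure itself is not reproduced here) can be sketched as symmetric uniform quantization to signed 8-bit integers. The per-tensor scale and the clipping range below are common conventions assumed for illustration, not necessarily the scheme used in the cited work.

```python
import numpy as np

def quantize_int8(x, scale=None):
    """Symmetric uniform quantization of a float tensor to signed 8-bit
    integers: q = round(x / scale), clipped to [-127, 127]."""
    if scale is None:
        scale = float(np.max(np.abs(x))) / 127.0 + 1e-12
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Map the int8 codes back to approximate float values."""
    return q.astype(np.float32) * scale

# Example: store weights as int8, recover an approximation at inference time.
w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
```
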
“…Model quantization [8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26], as a means of compressing models, can be applied at deployment time so that both the model size and the inference latency are reduced. At present, SR models are becoming larger and larger.…”
Section: Introduction (mentioning)
confidence: 99%
“…Another case is to improve the performance of the low-precision model so that it comes closer to the 32-bit floating-point model. For instance, Chu et al. [21] proposed a quantization method that progressively reduces the bit-width from the input layer to the last layer. The method builds on the observation that feature distributions in the shallow layers exhibit low class separability, whereas the distributions in the deeper layers exhibit high class separability.…”
Section: Introduction (mentioning)
confidence: 99%
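
A minimal sketch of the progressively decreasing bitwidth idea quoted above, assuming a simple linear schedule from a first-layer to a last-layer bitwidth; the endpoints and the rounding are illustrative choices, not the exact assignment used by Chu et al. [21].

```python
import numpy as np

def progressive_bitwidths(num_layers, first_bits=8, last_bits=2):
    """Assign a monotonically decreasing bitwidth from the first layer to
    the last, reflecting the quoted observation that deeper features are
    more class-separable and therefore tolerate coarser quantization."""
    schedule = np.linspace(first_bits, last_bits, num_layers)
    return [int(round(b)) for b in schedule]

# e.g. a 10-layer network: [8, 7, 7, 6, 5, 5, 4, 3, 3, 2]
print(progressive_bitwidths(10))
```
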