2019
DOI: 10.48550/arxiv.1902.01917
Preprint

Same, Same But Different - Recovering Neural Network Quantization Error Through Weight Factorization

Abstract: Quantization of neural networks has become common practice, driven by the need for efficient implementations of deep neural networks on embedded devices. In this paper, we exploit an oft-overlooked degree of freedom in most networks: for a given layer, individual output channels can be scaled by any factor provided that the corresponding weights of the next layer are inversely scaled. Therefore, a given network has many factorizations which change the weights of the network without changing its function. We pr…
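A minimal numpy sketch of the channel-scaling invariance the abstract describes, for a two-layer ReLU network; the layer sizes and scale factors below are arbitrary illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two dense layers with a ReLU in between: y = W2 @ relu(W1 @ x + b1) + b2
W1, b1 = rng.normal(size=(16, 8)), rng.normal(size=16)
W2, b2 = rng.normal(size=(4, 16)), rng.normal(size=4)
x = rng.normal(size=8)

relu = lambda z: np.maximum(z, 0.0)
y_ref = W2 @ relu(W1 @ x + b1) + b2

# Scale each output channel of layer 1 by a positive factor s_k and
# inversely scale the corresponding input weights of layer 2.
s = rng.uniform(0.1, 10.0, size=16)        # one positive factor per channel
W1_eq, b1_eq = W1 * s[:, None], b1 * s     # scale rows (output channels) of layer 1
W2_eq = W2 / s[None, :]                    # inverse-scale columns (input channels) of layer 2

# ReLU is positively homogeneous, relu(s*z) = s*relu(z) for s > 0,
# so the factorized network computes exactly the same function.
y_eq = W2_eq @ relu(W1_eq @ x + b1_eq) + b2
assert np.allclose(y_ref, y_eq)
```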

Cited by 9 publications (18 citation statements)
References 12 publications (17 reference statements)
“…Activation Equalization. In this step, we equalize activation ranges per channel similarly to the methods presented in [23,28]. Here, we set the scale-per-channel factor according to the value of the threshold that is selected per-tensor.…”
Section: Shift Negative Correction (SNC), mentioning
confidence: 99%
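A minimal sketch of the equalization step this excerpt describes, under the assumption that each channel's scale is the ratio between the per-tensor threshold and that channel's own observed range; the shapes, the max-based range estimate, and the function name are illustrative, not the citing paper's exact procedure.

```python
import numpy as np

def channel_equalization_scales(activations, per_tensor_threshold):
    """Per-channel scale factors that stretch each channel's range toward
    the per-tensor quantization threshold.

    activations: calibration data of shape (N, C).
    Returns one positive scale per channel (assumed form: threshold / channel range).
    """
    per_channel_range = np.abs(activations).max(axis=0)       # (C,)
    per_channel_range = np.maximum(per_channel_range, 1e-12)  # avoid divide-by-zero
    return per_tensor_threshold / per_channel_range           # (C,)

# Example: channels with very different ranges get scales that let each one
# use the full quantization range up to the per-tensor threshold.
rng = np.random.default_rng(0)
acts = rng.normal(size=(1024, 4)) * np.array([0.1, 1.0, 5.0, 20.0])
threshold = np.abs(acts).max()        # a simple per-tensor threshold choice
s = channel_equalization_scales(acts, threshold)
print(np.round(s, 3))                 # small-range channels receive large scales
```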
“…The motivation for using this scaling factor to equalize the activation ranges is to use the maximum range of the quantization bins for each channel (see Figure 4). The authors in [23,28] suggest performing channel equalization by exploiting the positive scale equivariance property of activation functions. It holds for any piece-wise linear activation function in its relaxed form φ(Sx) = S·φ̂(x), where φ is a piece-wise linear function, φ̂ is its modified version that fits this requirement, and S = diag(s) is a diagonal matrix with s_k denoting the scale factor for channel k.…”
Section: Shift Negative Correction (SNC), mentioning
confidence: 99%
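A small numeric check of the equivariance this excerpt refers to: it is exact for ReLU, and holds in the relaxed form for a clipped activation once the clipping point is divided by the scale. The choice of ReLU6 and the specific scales are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(1000, 4))
s = np.array([0.5, 2.0, 4.0, 10.0])          # positive per-channel scales, S = diag(s)

relu = lambda z: np.maximum(z, 0.0)
relu_clip = lambda z, c: np.clip(z, 0.0, c)  # ReLU6-style activation with clip point c

# Exact positive scale equivariance for ReLU: relu(x S) == relu(x) S
assert np.allclose(relu(x * s), relu(x) * s)

# Relaxed form for a clipped activation: phi(S x) = S * phi_hat(x),
# where phi_hat is the modified version with its clip point divided by s.
assert np.allclose(relu_clip(x * s, 6.0), relu_clip(x, 6.0 / s) * s)
```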
“…Banner et al. (2018) derived an analytical expression approximating the optimal threshold under the assumption of a Laplacian or Gaussian distribution of the weights, which allows keeping the accuracy reduction within a single percent for 8-bit weights and 4-bit activations. Meller et al. (2019) showed that equalizing channels and removing outliers improves quantization quality. Choukroun et al. (2019) used an exact one-dimensional line search to find the optimal quantization threshold, demonstrating state-of-the-art results for 4-bit weight and activation quantization.…”
Section: Related Work, mentioning
confidence: 99%
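A toy sketch of the kind of one-dimensional threshold optimization described here: sweep candidate clipping thresholds for a symmetric uniform quantizer and keep the one minimizing quantization MSE. A grid search stands in for the exact line-search, and the bit-width, grid size, and Laplacian test data are illustrative assumptions.

```python
import numpy as np

def quantize(x, threshold, bits=4):
    """Symmetric uniform quantization of x with clipping at +/- threshold."""
    levels = 2 ** (bits - 1) - 1
    step = threshold / levels
    return np.clip(np.round(x / step), -levels, levels) * step

def search_threshold(x, bits=4, num_candidates=200):
    """1-D grid search over clipping thresholds minimizing quantization MSE."""
    max_abs = np.abs(x).max()
    candidates = np.linspace(max_abs / num_candidates, max_abs, num_candidates)
    mses = [np.mean((x - quantize(x, t, bits)) ** 2) for t in candidates]
    return candidates[int(np.argmin(mses))]

# Heavy-tailed (Laplacian) weights: the best threshold clips the outliers
# rather than covering the full range of the tensor.
rng = np.random.default_rng(0)
w = rng.laplace(scale=1.0, size=100_000)
t_opt = search_threshold(w, bits=4)
print(f"max |w| = {np.abs(w).max():.2f}, searched threshold = {t_opt:.2f}")
```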