2021
DOI: 10.48550/arxiv.2102.10462
Preprint
BSQ: Exploring Bit-Level Sparsity for Mixed-Precision Neural Network Quantization

Cited by 8 publications (14 citation statements)
References 16 publications
“…Quantization reduces the number of data bits and parameter bits, and it is an actively studied area with a broad adoption in the industry [10,15,18,39,50]. There are many flavors of quantization including binary parameterization [10,39], low-precision fixed-point [15,32], and mixed-precision training [4,54]. While many of them require a dedicated hardware-level support, we limit our focus to the pure algorithmic solutions, and consider combining our method with an algorithmic quantization in Section 6.…”
Section: DNN Compression Methods (mentioning)
confidence: 99%
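To make the "low-precision fixed-point" flavor mentioned in the excerpt above concrete, the following minimal sketch applies symmetric per-tensor uniform quantization to a weight array. The function name, the 4-bit setting, and the per-tensor scaling choice are illustrative assumptions, not details taken from BSQ or the citing paper.

import numpy as np

def quantize_uniform(w, bits=8):
    """Symmetric uniform (fixed-point style) quantization of a weight array.

    Maps floating-point weights onto evenly spaced integer levels and returns
    the de-quantized values, so the rounding error is directly visible.
    """
    qmax = 2 ** (bits - 1) - 1            # e.g. 127 for 8 bits
    scale = np.max(np.abs(w)) / qmax      # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale                      # de-quantized weights

w = np.random.randn(4, 4).astype(np.float32)
print(np.abs(w - quantize_uniform(w, bits=4)).max())   # quantization error at 4 bits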
“…However, uniformly quantizing a model to ultra low-precision can cause significant accuracy degradation. It is possible to address this with mixed-precision quantization [51,80,100,180,191,201,226,233,236,250,273]. In this approach, each layer is quantized with different bit precision, as illustrated in Figure 8.…”
Section: B. Mixed-Precision Quantization (mentioning)
confidence: 99%
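As a rough illustration of the per-layer mixed precision described in this excerpt, the sketch below quantizes each layer's weights with its own bit width. The layer names and the specific bit allocation are hypothetical, chosen only to show the mechanism, not taken from any of the cited works.

import numpy as np

# Hypothetical per-layer bit allocation: sensitive first and last layers keep
# more bits, middle layers are pushed to lower precision.
bit_allocation = {"conv1": 8, "conv2": 4, "conv3": 2, "fc": 8}

def quantize_uniform(w, bits):
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

layers = {name: np.random.randn(64, 64).astype(np.float32) for name in bit_allocation}
quantized = {name: quantize_uniform(w, bit_allocation[name]) for name, w in layers.items()}

for name, w in layers.items():
    err = np.abs(w - quantized[name]).mean()
    print(f"{name}: {bit_allocation[name]} bits, mean abs error {err:.4f}")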
“…While the previously discussed work [10,11,25,28,39] takes a more systematic approach, others [15,35,36,38] leverage machine learning to address the challenge of mixed precision's large search space. [15,36] are more heavy-handed in their approaches.…”
Section: Mixed Precision Quantization (mentioning)
confidence: 99%
“…They also show that keeping NAS and quantization as separate processes yields models that perform worse than their combined NN+Quantization search with respect to accuracy, model size, and energy efficiency. [35,38] take a more traditional QAT approach when finding the best mixed precision schemes by learning the best mixed precision quantization parameters during QAT. [35] claims that learning the quantization function's parameters is possible if a good parameterization is chosen during training.…”
Section: Mixed Precision (mentioning)
confidence: 99%
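The claim that the quantization function's parameters can be learned during QAT, given a good parameterization, can be sketched with a learnable step size and a straight-through estimator. This is a generic PyTorch illustration under those assumptions, not the parameterization used in BSQ or in references [35,38]; the class name and initial step value are made up for the example.

import torch
import torch.nn as nn

class LearnedStepQuantizer(nn.Module):
    """Quantizer whose step size is a trainable parameter (QAT-style).

    Rounding has zero gradient almost everywhere, so the straight-through
    estimator passes gradients around torch.round via the detach trick.
    """
    def __init__(self, bits=4, init_step=0.05):
        super().__init__()
        self.qmax = 2 ** (bits - 1) - 1
        self.step = nn.Parameter(torch.tensor(init_step))

    def forward(self, w):
        scaled = w / self.step
        q = torch.clamp(torch.round(scaled), -self.qmax, self.qmax)
        # Straight-through estimator: forward uses the rounded q,
        # backward sees the identity on `scaled`.
        q = scaled + (q - scaled).detach()
        return q * self.step

quant = LearnedStepQuantizer(bits=4)
w = torch.randn(8, 8, requires_grad=True)
loss = (quant(w) ** 2).mean()
loss.backward()
print(quant.step.grad)   # the step size receives a gradient and can be learned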