2023
DOI: 10.3390/math11092112

Neuron-by-Neuron Quantization for Efficient Low-Bit QNN Training

Abstract: Quantized neural networks (QNNs) are widely used to obtain computationally efficient solutions to recognition problems. Overall, eight-bit QNNs have almost the same accuracy as full-precision networks while running several times faster. However, networks with lower quantization levels show inferior accuracy compared with their classical analogs. To address this issue, a number of quantization-aware training (QAT) approaches have been proposed. In this paper, we study QAT approaches for two- to eight-bit…
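For context only, the sketch below illustrates the generic "fake quantization with a straight-through estimator" idea that underlies most QAT schemes; it is not the neuron-by-neuron procedure from the paper. The bit width, tensor shapes, and the use of PyTorch are assumptions made purely for illustration.

```python
import torch

def fake_quantize(x: torch.Tensor, bits: int = 4) -> torch.Tensor:
    """Uniform fake quantization with a straight-through estimator (STE).

    Forward pass: values are mapped onto the integer grid of the given bit
    width, rounded, and mapped back to floats. Backward pass: the rounding
    is treated as the identity, so gradients flow through unchanged.
    """
    qmax = 2 ** bits - 1
    scale = (x.max() - x.min()).clamp(min=1e-8) / qmax
    zero = x.min()
    q = torch.round((x - zero) / scale).clamp(0, qmax)
    dequant = q * scale + zero
    # STE trick: forward returns the quantized value, backward sees identity.
    return x + (dequant - x).detach()

# Example: quantize a (hypothetical) weight tensor to 4 bits in a training step.
w = torch.randn(64, 32, requires_grad=True)
w_q = fake_quantize(w, bits=4)
loss = (w_q ** 2).mean()
loss.backward()  # gradients reach w thanks to the STE
```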

Cited by 5 publications (5 citation statements) | References 37 publications
“…The top-1 and top-5 accuracies for ImageNet are shown in Table 7. The best quantization parameters (N_x, N_w) for 4.6-bit models were (23, 23). As for CIFAR-10, eight-bit models have comparable accuracy to full-precision ones, while four bits give considerably lower results.…”
Section: Training Results
Mentioning, confidence: 93%
“…Notably, recent research [22] has demonstrated the ability to train four-bit networks for some tasks with minimal accuracy loss. Additionally, there is an ongoing development of quantization-aware training (QAT) techniques for scenarios where training data are available [23,24]. These approaches show significant potential for advancing the field of computationally efficient recognition and facilitating the implementation of QNNs in practical applications.…”
Section: Related Work
Mentioning, confidence: 99%
“…In the initial convolutional layers, L2 regularization is employed to penalize the error function more, aiming to reduce the complexity of the model and prevent overfitting. Along with data augmentation, all of the aforementioned measures are intended to reduce the model's variance [24,25].…”
Section: Short Answer
Mentioning, confidence: 99%
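As an illustration of the regularization strategy described in that citation, the sketch below shows one common way to apply L2 regularization (weight decay) only to the initial convolutional layers and to combine it with standard data augmentation. It assumes PyTorch and torchvision; the architecture, layer indices, and hyperparameters are placeholders, not the cited model's configuration.

```python
import torch
import torch.nn as nn
from torchvision import transforms

# Hypothetical small CNN; layer sizes are placeholders for illustration.
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),   # "initial" conv layers
    nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, 10),
)

# L2 regularization (weight decay) applied only to the first conv layers.
early = list(model[0].parameters()) + list(model[2].parameters())
rest = [p for p in model.parameters() if not any(p is q for q in early)]
optimizer = torch.optim.SGD(
    [{"params": early, "weight_decay": 1e-4},   # penalized
     {"params": rest, "weight_decay": 0.0}],    # not penalized
    lr=0.01, momentum=0.9,
)

# Data augmentation to further reduce variance (CIFAR-style transforms).
train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
```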