2020 25th International Conference on Pattern Recognition (ICPR), 2021
DOI: 10.1109/icpr48806.2021.9412841

Fast Implementation of 4-bit Convolutional Neural Networks for Mobile Devices

Abstract: Quantized low-precision neural networks are very popular because they require fewer computational resources for inference and can provide high performance, which is vital for real-time and embedded recognition systems. However, their advantages are apparent for FPGA and ASIC devices, while general-purpose processor architectures are not always able to perform low-bit integer computations efficiently. The most frequently used low-precision neural network model for mobile central processors is an 8-bit quantized …

Cited by 15 publications (23 citation statements). References 25 publications.
“…The first term of (3) presents matrix multiplication of quantized matrices: 8-bit with a 32-bit product in the case of gemmlowp and 4-bit with a 16-bit product in the case of [20]. The second and third terms do not depend on j and i respectively, so they are easier to compute: in terms of algorithmic complexity, the first term requires O(mnk) operations, the second O(mk), the third O(nk), and the fourth O(1).…”
Section: B. Integer GEMM
Confidence: 99%
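The four-term structure this excerpt describes matches the standard zero-point expansion of a quantized matrix product: (A_q - z_a)(B_q - z_b) = A_q B_q - z_b * rowsum(A_q) - z_a * colsum(B_q) + z_a * z_b * k. Below is a minimal NumPy sketch of that expansion; the symbols z_a and z_b, the int32 accumulator, and the term grouping are assumptions made for illustration, not a reproduction of equation (3) from the cited paper.

```python
import numpy as np

def quantized_gemm(A_q, B_q, z_a, z_b):
    """Compute (A_q - z_a) @ (B_q - z_b) via the four-term expansion.

    Costs match the excerpt:
      term1 = A_q @ B_q             -> O(mnk), the integer GEMM itself
      term2 = z_b * row sums of A_q -> O(mk), independent of column j
      term3 = z_a * col sums of B_q -> O(nk), independent of row i
      term4 = z_a * z_b * k         -> O(1)
    """
    k = A_q.shape[1]
    term1 = A_q.astype(np.int32) @ B_q.astype(np.int32)      # O(mnk)
    term2 = z_b * A_q.sum(axis=1, dtype=np.int32)[:, None]   # O(mk)
    term3 = z_a * B_q.sum(axis=0, dtype=np.int32)[None, :]   # O(nk)
    term4 = np.int32(z_a) * z_b * k                          # O(1)
    return term1 - term2 - term3 + term4

# Sanity check against the direct product.
rng = np.random.default_rng(0)
A_q = rng.integers(0, 16, size=(4, 8))   # 4-bit range [0, 15]
B_q = rng.integers(0, 16, size=(8, 3))
assert np.array_equal(quantized_gemm(A_q, B_q, 7, 5), (A_q - 7) @ (B_q - 5))
```

The point of the rearrangement is that only the first term needs the full O(mnk) inner loop; the zero-point corrections are cheap one-dimensional sums that can be folded in after the integer GEMM.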
“…In GEMM-based convolution it limits the number of channels in the input feature map [20]. Let us consider convolution with an H_k × W_k kernel.…”
Section: B. Integer GEMM
Confidence: 99%
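The channel limit comes from overflow analysis: in im2col-style GEMM convolution the inner dimension is k = H_k × W_k × C, so a fixed-width accumulator bounds the channel count C. A back-of-the-envelope sketch follows, assuming unsigned 4-bit operands and an unsigned 16-bit accumulator; the concrete widths, signedness, and any partial widening used in [20] are not reproduced here and may change the bound.

```python
def max_input_channels(kernel_h, kernel_w, acc_bits=16, operand_max=15):
    """Largest channel count C such that k = kernel_h * kernel_w * C
    worst-case products (operand_max * operand_max each) accumulate
    in an unsigned acc_bits-wide register without overflow."""
    worst_product = operand_max * operand_max             # 225 for 4-bit
    max_accumulations = (2 ** acc_bits - 1) // worst_product
    return max_accumulations // (kernel_h * kernel_w)

# Example: a 3x3 kernel with these assumed widths caps C at 32.
print(max_input_channels(3, 3))  # -> 32
```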