2021
DOI: 10.3390/electronics10222859

FPGA-Based Convolutional Neural Network Accelerator with Resource-Optimized Approximate Multiply-Accumulate Unit

Abstract: Convolutional neural networks (CNNs) are widely used in modern applications for their versatility and high classification accuracy. Field-programmable gate arrays (FPGAs) are considered to be suitable platforms for CNNs based on their high performance, rapid development, and reconfigurability. Although many studies have proposed methods for implementing high-performance CNN accelerators on FPGAs using optimized data types and algorithm transformations, accelerators can be optimized further by investigating mor…

Cited by 16 publications (6 citation statements)
References 24 publications (40 reference statements)
"…These include devices like the ARM Ethos NPU, BeagleBone AI, Intel Movidius NCS, NVIDIA Jetson Nano, and many others. These hardware accelerators are computationally efficient, but not optimized for power consumption [8]…"
Section: Review of Edge Computing (mentioning)
confidence: 99%
"…The authors in [26] proposed an accelerator for the LeNet-5 architecture to perform handwritten-digit classification. The proposed strategy is based on three major aspects: loop parallelization to utilize resources, fixed-point data optimization to find the minimum number of bits that maintains the accuracy level, and finally implementing approximate MAC units through logic blocks such as look-up tables (LUTs) and flip-flops (FFs) rather than using high-precision digital signal processors (DSPs)…"
Section: Related Work (mentioning)
confidence: 99%
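The strategy quoted above combines fixed-point quantization with approximate MAC units. As a rough illustration only (the paper's actual bit widths and approximation scheme are not given here), the following sketch models a signed 8-bit fixed-point MAC whose multiplier truncates low-order partial-product bits, in the spirit of a reduced-precision LUT-based unit; `FRAC_BITS`, `TRUNC_BITS`, and all function names are hypothetical choices, not values from the paper.

```python
# Hypothetical model of a truncation-based approximate fixed-point MAC.
# FRAC_BITS and TRUNC_BITS are illustrative assumptions, not from the paper.

FRAC_BITS = 4    # fractional bits in the assumed Q3.4 fixed-point format
TRUNC_BITS = 3   # low-order product bits discarded by the approximation

def to_fixed(x: float) -> int:
    """Quantize a real value to a signed 8-bit fixed-point integer."""
    return max(-128, min(127, round(x * (1 << FRAC_BITS))))

def approx_mac(acc: int, a: int, b: int) -> int:
    """Multiply-accumulate that zeroes the low-order product bits,
    modeling the error introduced by a reduced-precision multiplier."""
    product = a * b
    product = (product >> TRUNC_BITS) << TRUNC_BITS  # truncate low bits
    return acc + product

# Dot product of two small vectors, exact vs. approximate
xs = [to_fixed(v) for v in (0.3, -0.25, 0.7)]
ws = [to_fixed(v) for v in (0.1, 0.5, -0.5)]

exact = sum(a * b for a, b in zip(xs, ws))
acc = 0
for a, b in zip(xs, ws):
    acc = approx_mac(acc, a, b)

print(exact, acc)  # the approximate result deviates slightly from the exact one
```

The appeal of such a unit on an FPGA is that a truncated multiplier needs fewer LUTs and FFs than a full-width one, freeing scarce DSP blocks, at the cost of a small, bounded accumulation error that the network's accuracy can often absorb.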
"…Moreover, Table 6 summarizes the comparison with state-of-the-art ANN training accelerators for MNIST classification [40][41][42]. Small networks are selected for the ANNs to present a better comparison with our work…"
Section: F. Fractional Precision (mentioning)
confidence: 99%