2019
DOI: 10.3390/electronics8030281

An FPGA-Based CNN Accelerator Integrating Depthwise Separable Convolution

Abstract: The Convolutional Neural Network (CNN) has been used in many fields and has achieved remarkable results in tasks such as image classification, face detection, and speech recognition. Compared to a GPU (graphics processing unit) and an ASIC, an FPGA (field-programmable gate array)-based CNN accelerator has great advantages due to its low power consumption and reconfigurability. However, the FPGA's extremely limited resources and the CNN's huge number of parameters and computational complexity pose great challenges to the desig…
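The factorization the paper's title refers to can be illustrated with a minimal sketch (not from the paper itself): a depthwise separable convolution replaces one k×k standard convolution with a per-channel k×k depthwise convolution followed by a 1×1 pointwise convolution, sharply cutting the parameter count that an FPGA accelerator must store and fetch. The layer sizes below are arbitrary example values.

```python
# Illustrative parameter-count comparison for a standard convolution
# vs. its depthwise separable factorization (depthwise k x k + pointwise 1 x 1).
# Layer dimensions here are arbitrary examples, not taken from the paper.

def conv_params(k, c_in, c_out):
    # Standard convolution: one k x k x c_in kernel per output channel.
    return k * k * c_in * c_out

def dw_separable_params(k, c_in, c_out):
    # Depthwise stage: one k x k kernel per input channel;
    # pointwise stage: one 1 x 1 x c_in kernel per output channel.
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 128, 128
std = conv_params(k, c_in, c_out)          # 147456 weights
sep = dw_separable_params(k, c_in, c_out)  # 17536 weights
print(std, sep, round(std / sep, 1))       # ~8.4x fewer parameters
```

For a 3×3 kernel the reduction approaches 9× as the channel count grows, which is why the technique suits the tight on-chip memory budget of FPGAs.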

Cited by 77 publications (65 citation statements)
References 24 publications
“…This is especially common for FPGAs [28], [29], and used for example in hand tracking [30] or language processing [31]. Special attention is also on adapting key principles in neural network architectures, such as depth-wise convolutions for FPGAs [32] or quantization-based operations, such as binary neural networks [33]. In contrast to general, "all purpose" GPUs, tensor-processing units (TPUs) are specialised on matrix operations, such as multiplications and additions, as massively used in neural networks.…”
Section: Related Work
confidence: 99%
“…Also, it has to be noted that this hybrid system model was deployed in a tire manufacturing unit, and it produced efficient results in automatically diagnosing bubble defects in the treads and sidewalls of tires. In future work, more advanced CNN-enabled approaches can be implemented for automated defect detection [26][27][28][29][30], thus ensuring and realizing a sustainable tire manufacturing process.…”
Section: Discussion
confidence: 99%
“…It also has low hardware utilization, which results in low throughput per PE. [8] proposes an FPGA-based CNN accelerator with an integrated depth-wise separable mode of operation. This accelerator, however, has low throughput because it uses a 32-bit floating-point format.…”
Section: Related Work
confidence: 99%