2019 29th International Symposium on Power and Timing Modeling, Optimization and Simulation (PATMOS)
DOI: 10.1109/patmos.2019.8862166

High Performance Accelerator for CNN Applications


Cited by 16 publications (8 citation statements)
References 7 publications
“…On the other hand, studies on CNN hardware acceleration architecture designs are increasing vigorously [29, 35-37]. Parallel processing engines (PEs) and pipelined architectures are widely used in CNN accelerators to improve bandwidth and reduce latency.…”
Section: Hardware Acceleration for CNNs
confidence: 99%
“…However, the PE array is not fully utilized during acceleration because of an inappropriate 12 × 14 array size. Kyriakos et al [36] used a highly pipelined structure in their architecture with each computation operation as a stage. This structure reduces the access to off-chip DRAM, thereby achieving low latency and low power consumption.…”
Section: Hardware Acceleration for CNNs
confidence: 99%
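The quoted statement above describes a highly pipelined datapath in which each computation operation forms its own stage. As an illustrative sketch only (not the cited paper's actual RTL; the function name and two-stage split are assumptions for illustration), a software model of a two-stage pipelined multiply-accumulate (MAC) unit can show the idea:

```python
# Illustrative sketch: a cycle-by-cycle software model of a 2-stage
# pipelined MAC datapath (stage 1 multiplies, stage 2 accumulates).
# In hardware both stages operate on different operands in the same
# clock cycle; here we step the pipeline one "cycle" per loop iteration.

def pipelined_mac(weights, activations):
    acc = 0
    mul_reg = None  # pipeline register between the multiply and add stages
    # Append one extra cycle to flush the last product out of the pipeline.
    stream = list(zip(weights, activations)) + [None]
    for item in stream:
        # Stage 2: accumulate the product latched in the previous cycle.
        if mul_reg is not None:
            acc += mul_reg
        # Stage 1: multiply the incoming weight/activation pair.
        mul_reg = item[0] * item[1] if item is not None else None
    return acc

# One tap of a 1-D convolution computed with the pipelined MAC:
print(pipelined_mac([1, 2, 3], [4, 5, 6]))  # 1*4 + 2*5 + 3*6 = 32
```

Keeping operands streaming through such on-chip pipeline registers, rather than reading intermediate values back from off-chip DRAM, is what yields the low latency and low power consumption the citing authors attribute to this structure.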
“…As a case in point, in some studies such as [38,46], the accuracy was improved to a large extent in AV use cases; however, the detection was not in real-time. In [23,26,51], an FPGA accelerator was used to improve the speed, which is a costly solution. Besides, in [26], the improvement in execution speed (by 11.5%) was compared against execution on a CPU, which is usually the slowest hardware for executing CNNs.…”
Section: Convolutional Neural Network Improvements
confidence: 99%
“…FPGAs can overcome these limitations due to their custom parallel processing capabilities and flexibility. Various CNN implementations on FPGAs have been reported in the literature [11]-[13], focusing on different aspects, e.g., the optimization of only the convolutional layers [14], [15] or the overall accelerator throughput [16]. There are also SW/HW co-design solutions that exploit the aggregate power of both an embedded processor and the programmable logic [17], [18].…”
Section: Background and Literature Review
confidence: 99%