2009 20th IEEE International Conference on Application-Specific Systems, Architectures and Processors (ASAP)
DOI: 10.1109/asap.2009.25

A Massively Parallel Coprocessor for Convolutional Neural Networks

Cited by 201 publications (96 citation statements)
References 11 publications
“…25) can be used to classify the DNN dataflows in recent works [82][83][84][85][86][87][88][89][90][91][92][93] based on their data handling characteristics [80]:…”
Section: B. Energy-Efficient Dataflow for Accelerators
Confidence: 99%
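The classification referred to in this excerpt groups accelerator dataflows by which operand each processing element keeps in its local storage. As a minimal illustration (not taken from the cited paper; the loop bounds, array names, and the weight-stationary vs. output-stationary split are assumptions drawn from the general survey literature), the two loop nests below compute the same convolution but reuse different operands locally:

```python
import numpy as np

# Illustrative only: two loop orderings that differ in which operand is
# reused from a PE's local storage (the "data handling characteristic").
# Shapes are hypothetical: K output channels, C input channels,
# R x R filters, H x W output feature map.
K, C, R, H, W = 4, 3, 3, 8, 8
weights = np.random.rand(K, C, R, R)
inputs = np.random.rand(C, H + R - 1, W + R - 1)

# Weight-stationary flavour: a filter tap is loaded once and reused
# across every output pixel before the next tap is fetched.
out_ws = np.zeros((K, H, W))
for k in range(K):
    for c in range(C):
        for i in range(R):
            for j in range(R):
                w = weights[k, c, i, j]          # held "stationary"
                for y in range(H):
                    for x in range(W):
                        out_ws[k, y, x] += w * inputs[c, y + i, x + j]

# Output-stationary flavour: one output pixel accumulates fully in
# local storage before being written back.
out_os = np.zeros((K, H, W))
for k in range(K):
    for y in range(H):
        for x in range(W):
            acc = 0.0                            # held "stationary"
            for c in range(C):
                for i in range(R):
                    for j in range(R):
                        acc += weights[k, c, i, j] * inputs[c, y + i, x + j]
            out_os[k, y, x] = acc

assert np.allclose(out_ws, out_os)
```

Both orderings produce identical results; what the taxonomy captures is which operand stays put in each processing element, and therefore how much traffic flows between off-chip memory, on-chip buffers, and the PEs.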
“…[1,2,3,4] Due to the specific computation pattern of CNN, general purpose processors hardly meet the implementation requirement, which encourages the proposal of various hardware implementations based on FPGA, GPU and ASIC [5,6,7]. CNN contains numerous 2D convolutions, which are responsible for more than 90% of the whole computation [8].…”
Section: Introduction
Confidence: 99%
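The 90% figure quoted above follows from the arithmetic of a convolutional layer: every output pixel of every output map needs one multiply-accumulate per filter tap per input channel. A rough sketch, with hypothetical layer sizes chosen only for illustration (not taken from the cited papers):

```python
# Rough operation count for one layer (hypothetical sizes).
def conv_layer_macs(K, C, R, H, W):
    """MACs for K output maps of H x W, C input maps, R x R filters."""
    return K * C * R * R * H * W

def fc_layer_macs(n_in, n_out):
    """MACs for a fully connected layer."""
    return n_in * n_out

# Example: a mid-sized convolutional layer vs. a large fully connected layer.
conv = conv_layer_macs(K=256, C=128, R=3, H=28, W=28)   # ~231 M MACs
fc = fc_layer_macs(n_in=4096, n_out=4096)                # ~16.8 M MACs
print(f"conv layer: {conv/1e6:.0f} M MACs, fc layer: {fc/1e6:.1f} M MACs")
```

Even a mid-sized convolutional layer performs an order of magnitude more multiply-accumulates than a large fully connected layer, which is why 2D convolution dominates the workload and is the natural target for dedicated hardware.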
“…To solve this problem, many efforts have been made [1,4,9,10,11]. Among these approaches, the architecture which is inspired by [12], first introduced into CNN by [1], is commonly adopted.…”
Section: Introduction
Confidence: 99%
“…Other recent works propose different CNN acceleration hardware. For example, [3,[10][11][12]22] focus on 2D-convolvers, which play the roles of both compute modules and data caches. Meanwhile, [18,19] use FMA units for computation.…”
Section: Related Work
Confidence: 99%
“…Several key similarities cause these methods to suffer from the underutilization problem we observe in our Single-CLP design. For example, the 2D-convolvers used in [3,10,12,22] must be provisioned for the largest filter across layers; they will necessarily be underutilized when computing layers with smaller filters. In [19], the organization of the compute modules depends on the number of output feature maps and their number of rows.…”
Section: Related Work
Confidence: 99%
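The underutilization argument in this excerpt can be made concrete with a short back-of-the-envelope calculation: a 2D-convolver array sized for the largest filter in the network leaves most of its multipliers idle on layers with smaller filters. The filter sizes below are hypothetical (an AlexNet-like mix), not figures from the cited papers:

```python
# Utilization of a 2D-convolver provisioned for the largest filter
# across layers (filter sizes are hypothetical examples).
layer_filter_sizes = [11, 5, 3, 3, 3]

k_max = max(layer_filter_sizes)          # convolver built as k_max x k_max
provisioned_macs = k_max * k_max         # multipliers physically present

for k in layer_filter_sizes:
    used = k * k                         # multipliers active for this layer
    util = used / provisioned_macs
    print(f"{k}x{k} filter on a {k_max}x{k_max} convolver: "
          f"{util:.0%} of multipliers used")

# A 3x3 layer uses only 9 of the 121 multipliers (~7%), which is the
# underutilization the Single-CLP comparison points out.
```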