2020
DOI: 10.1109/access.2020.3025550

Roofline-Model-Based Design Space Exploration for Dataflow Techniques of CNN Accelerators

Abstract: To effectively compute convolutional layers, a complex design space must be considered (e.g., the dataflow techniques associated with the layer parameters, loop transformation techniques, and hardware parameters). For efficient design space exploration (DSE) of various dataflow techniques, namely, the weight-stationary (WS), output-stationary (OS), row-stationary (RS), and no local reuse (NLR) techniques, the processing element (PE) structure and computational pattern of each dataflow technique are analyzed. Various p…
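The roofline model named in the title bounds attainable throughput by the lesser of the platform's compute peak and the product of memory bandwidth and operational intensity. The minimal sketch below shows only that bound; the peak, bandwidth, and intensity values are illustrative assumptions, not numbers taken from the paper.

```python
def roofline(peak_gflops, bandwidth_gbs, intensity_flop_per_byte):
    """Attainable performance (GFLOP/s): the lesser of the compute roof and
    the bandwidth roof at the given operational intensity."""
    return min(peak_gflops, bandwidth_gbs * intensity_flop_per_byte)

# Illustrative platform numbers (assumed for this sketch, not from the paper)
PEAK_GFLOPS = 200.0   # compute roof
BW_GBS = 12.8         # off-chip bandwidth

for oi in (1, 4, 16, 64):   # operational intensity in FLOP/byte
    print(f"OI = {oi:2d} FLOP/B -> {roofline(PEAK_GFLOPS, BW_GBS, oi):6.1f} GFLOP/s")
```

Kernels whose operational intensity sits left of the ridge point (compute peak divided by bandwidth) are memory-bound; the paper's DSE aims to move the chosen design toward the compute roof.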

Cited by 14 publications (11 citation statements)
References 29 publications (64 reference statements)
“…These results are estimated using Timeloop [30] and Accelergy [37] tools. Our data show that our event-driven technique consumes less energy than alternate data flows such as weight, output, and input stationary [32,35]. The specifications of three different layers studied are described in Table 1.…”
mentioning
confidence: 89%
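For readers unfamiliar with the stationary dataflows named in this quote and in the abstract, the sketch below contrasts weight-stationary and output-stationary schedules as plain loop nests over a 1-D convolution. It is an illustrative simplification of what "stationary" means, not the PE-array mapping or energy model analyzed in the paper.

```python
import numpy as np

def conv1d_weight_stationary(x, w):
    """Weight-stationary: each weight w[k] is held fixed ('stationary') and
    reused across every output position before the next weight is loaded."""
    y = np.zeros(len(x) - len(w) + 1)
    for k in range(len(w)):          # outer loop over weights -> each fetched once
        for i in range(len(y)):      # inner loop reuses w[k] across all outputs
            y[i] += w[k] * x[i + k]
    return y

def conv1d_output_stationary(x, w):
    """Output-stationary: each partial sum stays in the local accumulator until
    it is fully reduced, so partial sums are never written back early."""
    y = np.zeros(len(x) - len(w) + 1)
    for i in range(len(y)):          # outer loop over outputs -> psum stays local
        acc = 0.0
        for k in range(len(w)):      # inner loop streams weights/inputs past acc
            acc += w[k] * x[i + k]
        y[i] = acc
    return y

x, w = np.arange(8.0), np.array([1.0, 0.5, 0.25])
assert np.allclose(conv1d_weight_stationary(x, w), conv1d_output_stationary(x, w))
```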
“…However, these deep networks [14,21,25,33,36] can have multiple hidden layers, millions of parameters, billions of operations and require tremendous storage and intense computation resources, making it difficult to realize energy-efficient and high-performance solutions. To address this issue, several model compression techniques [10,17,18,20,26,38], efficient dataflow techniques [32,35], and dataflow accelerators [1,3,5,7,8,15,16,19,28,31,39,40,41,42,43] have been proposed and widely investigated in recent years. * These authors contributed equally to this work.…”
Section: Introduction
mentioning
confidence: 99%
“…In 2016, Motamedi et al. adopted the design space exploration method to explore the optimal design parameters [19]. In this study, the roofline model [11] is used to explore the best performance on a certain hardware platform, and the optimal values of Tn, Tm, Tr, and Tc are selected. The roofline model is presented in Figure 9.…”
Section: Design Space Exploration
mentioning
confidence: 99%
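The quoted passage selects the tile sizes Tn, Tm, Tr, and Tc under the roofline model. The sketch below enumerates candidate tilings for one convolutional layer, estimates operational intensity with a simplified analytical traffic model in the spirit of that roofline-based DSE, and keeps the tiling with the highest attainable throughput. The layer shape, hardware numbers, and the traffic model itself are assumptions for illustration, not the cited works' exact cost model.

```python
import math
from itertools import product

# Illustrative layer shape and platform numbers (assumed, not from the cited works)
N, M, R, C, K, S = 192, 128, 13, 13, 3, 1   # in/out channels, output rows/cols, kernel, stride
PEAK_GFLOPS = 100.0                          # assumed compute roof
BW_GBS = 4.5                                 # assumed off-chip bandwidth (GB/s)

def evaluate(Tm, Tn, Tr, Tc):
    """Roofline score for one tiling: min(compute roof, bandwidth * intensity)."""
    ops = 2.0 * M * N * R * C * K * K                        # total MAC operations (as FLOPs)
    trips = math.ceil(M / Tm) * math.ceil(N / Tn) * math.ceil(R / Tr) * math.ceil(C / Tc)
    tile_in = Tn * (S * Tr + K - S) * (S * Tc + K - S)       # input tile elements
    tile_w = Tm * Tn * K * K                                 # weight tile elements
    tile_out = Tm * Tr * Tc                                  # output tile elements
    out_trips = math.ceil(M / Tm) * math.ceil(R / Tr) * math.ceil(C / Tc)
    bytes_moved = 4.0 * (trips * (tile_in + tile_w) + out_trips * tile_out)
    intensity = ops / bytes_moved                            # FLOP per byte of off-chip traffic
    return min(PEAK_GFLOPS, BW_GBS * intensity), intensity

candidates = product((8, 16, 32, 64), (4, 8, 16), (7, 13), (7, 13))   # (Tm, Tn, Tr, Tc)
best = max(((evaluate(*t), t) for t in candidates), key=lambda s: s[0][0])
print("best attainable GFLOP/s and FLOP/B:", best[0], "with (Tm, Tn, Tr, Tc) =", best[1])
```

Real DSE flows of this kind also constrain the tiles by on-chip buffer capacity and PE count; those constraints are omitted here to keep the sketch short.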
“…Lu et al [10] introduced an efficient sparse Winograd method to accelerate CNN on FPGA. Besides, the roofline model is adopted to explore the efficient calculation of convolutional layer [11], and performance modelling is designed for CNN inference accelerators [12].…”
Section: Introduction
mentioning
confidence: 99%