2017 27th International Conference on Field Programmable Logic and Applications (FPL) 2017
DOI: 10.23919/fpl.2017.8056844

Flexible FPGA design for FDTD using OpenCL

Cited by 19 publications (14 citation statements)
References 7 publications
“…We use shift registers as on-chip buffers to take advantage of the regular memory access pattern in stencil computation. This is a well-known optimization that is employed in many deep-pipeline [9,20,22]. This optimization is not applicable to CPUs and GPUs due to lack of hardware support for this storage type.…”
Section: Spatial Blocking on FPGAs
confidence: 99%
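
The shift-register buffering described in the statement above can be illustrated with a minimal software sketch. It is not code from the cited paper: the function name `stencil_shift_register`, the 5-point averaging stencil, and the row-major grid layout are all assumptions chosen for illustration. A Python `deque` stands in for the hardware shift register, and one cell is shifted in per loop iteration, as a deep pipeline would do once per clock cycle.

```python
from collections import deque


def stencil_shift_register(grid, width, height):
    """Apply a 5-point average to a row-major 2D grid streamed cell by cell.

    Hypothetical illustration: the deque models an on-chip shift register
    that holds exactly the 2*width + 1 most recent cells, which is enough
    to reach the north and south neighbours of the centre cell.
    """
    size = 2 * width + 1
    sr = deque([0.0] * size, maxlen=size)
    out = list(grid)  # boundary cells are copied through unchanged

    for i in range(width * height):
        sr.append(grid[i])      # shift one new cell in, oldest cell drops out
        c = i - width           # cell currently sitting at the window centre
        if c < 0:
            continue            # window not yet filled up to the centre
        y, x = divmod(c, width)
        if 0 < x < width - 1 and 0 < y < height - 1:
            # Fixed tap positions: north, west, centre, east, south.
            out[c] = (sr[0] + sr[width - 1] + sr[width]
                      + sr[width + 1] + sr[2 * width]) / 5.0
    return out


if __name__ == "__main__":
    demo = [float(v) for v in range(12)]   # 4 columns x 3 rows
    print(stencil_shift_register(demo, 4, 3))
```

Each input cell is read from the source array exactly once, and all five neighbours come from fixed positions in the buffer. That fixed-tap access pattern is what makes the structure map naturally onto FPGA shift registers.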
“…We achieve this large performance advantage despite the fact that the Kintex-7 XC7Z045 FPGA they use has more DSPs and roughly half of the logic and Block RAM count of our Stratix V A7 FPGA. [1,9,20,22] present the recent high-performing deep-pipelined implementations of stencil computation on FPGAs, all of which avoid spatial blocking and hence, put hard limits on input dimensions relative to on-chip memory size. In contrast, we do employ spatial blocking to avoid such restrictions which limit usability in real-world HPC applications, and show that it is still possible to achieve high performance.…”
Section: Related Work
confidence: 99%
“…We also employ temporal blocking to take advantage of the temporal locality of stencil computation by storing intermediate results of multiple iterations (time steps) on-chip, before finally writing them back to external memory. Unlike many previous studies on FPGAs [14][15][16][17], combining spatial and temporal blocking allows us to achieve high performance without restricting input size.…”
Section: A Base Implementation for First-order Stencils
confidence: 99%
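
As a rough illustration of how spatial and temporal blocking combine, the sketch below processes a 1D grid tile by tile and advances each tile several time steps before writing it back. It is an assumption-laden toy, not the cited authors' implementation: the names `step` and `blocked_stencil`, the 3-point averaging stencil, and the fixed end cells are all hypothetical. Each tile is loaded with a halo of `steps` cells per side, because stale halo values creep inward by one cell per time step.

```python
def step(u):
    """One time step of a 3-point average; the two end cells are held fixed."""
    inner = [(u[i - 1] + u[i] + u[i + 1]) / 3.0 for i in range(1, len(u) - 1)]
    return [u[0]] + inner + [u[-1]]


def blocked_stencil(u, block, steps):
    """Advance `u` by `steps` time steps, one spatial block at a time.

    Hypothetical sketch: `tile` plays the role of on-chip memory, and the
    list `out` plays the role of external memory. Assumes len(u) >= 2.
    """
    n = len(u)
    out = [0.0] * n
    for start in range(0, n, block):
        end = min(start + block, n)
        lo = max(start - steps, 0)      # left halo, clipped at the grid edge
        hi = min(end + steps, n)        # right halo, clipped at the grid edge
        tile = u[lo:hi]
        for _ in range(steps):          # temporal blocking: iterate "on-chip"
            tile = step(tile)
        # Only the block interior is valid; halo cells are discarded.
        out[start:end] = tile[start - lo:end - lo]
    return out


if __name__ == "__main__":
    grid = [float(i * i % 7) for i in range(16)]
    print(blocked_stencil(grid, block=4, steps=3))
```

The tile size depends only on `block` and `steps`, not on the grid length, which is the property the statement above refers to when it says input size is unrestricted. The cost is redundant computation in the overlapping halos.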
“…Examples of the latter are MD implementations on graphics processing units (GPUs) [Abraham et al 2015; Anderson et al 2008; Brown et al 2012; Colberg and Höfling 2011; Eastman and Pande 2010; Le Grand et al 2013; Stone et al 2010], field-programmable gate arrays (FPGAs) [Herbordt et al 2008a,b], and application-specific integrated circuits (ASICs) [Shaw et al 2007, 2014]. While the use of GPUs for scientific applications is relatively widespread [Owens et al 2008; Preis et al 2009; Weigel 2012], the use of ASICs [Boyle et al 2005; Brown and Christ 1988; Fukushige et al 1999] and FPGAs is less common [Baity-Jesi et al 2014; Belletti et al 2009; Giefers et al 2014; Kenter et al 2017, 2018; Meyer et al 2012], but gained attention over the last years. In general, to maximize the computational power for a given silicon area, or equivalently minimize the power-consumption per arithmetic operation, more and more computing units are replaced with lower-precision units.…”
Section: Introduction
confidence: 99%