2015
DOI: 10.1145/2842615

On How to Accelerate Iterative Stencil Loops

Abstract: In high-performance systems, stencil computations play a crucial role as they appear in a variety of different fields of application, ranging from partial differential equation solving, to computer simulation of particles' interaction, to image processing and computer vision. The computationally intensive nature of those algorithms created the need for solutions to efficiently implement them in order to save both execution time and energy. This, in combination with their regular structure, has justified their …

Citations: cited by 24 publications (19 citation statements)
References: 47 publications (48 reference statements)
“…Previous work [1,9,20,22] have shown that FPGAs can achieve GPU-level performance in stencil computation. Most of such work achieve this level of performance by relying on temporal blocking without spatial blocking.…”
Section: Introduction (mentioning)
confidence: 99%
“…In this section, we provide an example that illustrates why the DCMI acceleration strategy is superior to current state-of-the-art FPGA-based accelerators [4,40]. We first explain the operation of the stencil compute kernel in Section 2.1.…”
Section: Explaining the Efficiency of DCMI (mentioning)
confidence: 99%
“…The importance and high potential for parallelism makes ISL-applications an interesting acceleration target. However, developing efficient ISL accelerators is challenging for two reasons [4]. First, ISL-applications operate on a large data array in which all data elements are typically updated in each iteration of the outer-loop.…”
Section: Introduction (mentioning)
confidence: 99%
“…CAOS currently supports three different architectural templates: SST (Single Stencil Time-Step), Master-Slave, and Dataflow. SST provides an architecture targeted for stencil codes written in C [21]. Within this context, CAOS offers a design exploration algorithm [22] that jointly maximizes the number of SST processors that can be instantiated on the target FPGA, and identifies a floorplan of the design that minimizes the inter-component wire-length in order to allow implementing the system at a higher frequency.…”
Section: The Extra Open Research Platform (mentioning)
confidence: 99%
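
One of the citation statements above describes ISL applications as operating on a large data array in which all elements are typically updated in each iteration of the outer loop. A minimal sketch of that loop structure follows. It is an illustration only, not code from the cited paper or from any of the citing works: the function name `jacobi_1d`, the 3-point averaging stencil, and the fixed-boundary handling are all assumptions chosen for brevity.

```python
def jacobi_1d(grid, steps):
    """Apply a 3-point averaging stencil to every interior cell, `steps` times.

    Hypothetical example: the stencil shape and the fixed boundaries are
    assumptions made for illustration, not taken from the paper.
    """
    current = list(grid)
    for _ in range(steps):                      # outer (time) loop
        nxt = current[:]                        # boundary cells are carried over unchanged
        for i in range(1, len(current) - 1):    # inner (space) loop over interior cells
            # Each new value depends only on the previous iteration's neighbours.
            nxt[i] = (current[i - 1] + current[i] + current[i + 1]) / 3.0
        current = nxt                           # the whole array is replaced each iteration
    return current


result = jacobi_1d([0.0, 0.0, 3.0, 0.0, 0.0], steps=1)
print(result)  # [0.0, 1.0, 1.0, 1.0, 0.0]
```

Because every time step reads the entire previous array, a straightforward implementation streams the full data set through memory once per iteration. This is presumably the kind of cost that the blocking strategies mentioned in the citation statements aim to reduce.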