2015
DOI: 10.1137/140991133

Multicore-Optimized Wavefront Diamond Blocking for Optimizing Stencil Updates

Abstract: The importance of stencil-based algorithms in computational science has focused attention on optimized parallel implementations for multilevel cache-based processors. Temporal blocking schemes leverage the large bandwidth and low latency of caches to accelerate stencil updates and approach theoretical peak performance. A key ingredient is the reduction of data traffic across slow data paths, especially the main memory interface. In this work we combine the ideas of multicore wavefront temporal blocking and dia…

Cited by 68 publications
(68 citation statements)
References 24 publications
“…For example, seismic [63], stencil [64], [65], electromagnetic [66], molecular dynamics [67], Fast Multipole Methods [68], tensors [39], deep learning [69], [70], databases [49], [71], [72], big data [73], systems and graph engines [74], and many more.…”
Section: State-of-the-art Shared-memory Optimizations (mentioning)
confidence: 99%
“…We use our cache blocking technique described in (Malas, Hager, Ltaief, Stengel, et al 2014). The implementation code is also available online (Malas 2015).…”
Section: Methodology: Multi-threaded Wavefront Diamond Blocking (mentioning)
confidence: 99%
“…The parameter search space is reduced by our analytical model, which predicts the largest diamond size that fits in a given cache size. (Malas, Hager, Ltaief, Stengel, et al 2014)…”
Section: Methodology: Multi-threaded Wavefront Diamond Blocking (mentioning)
confidence: 99%
“…The example of the latter is the wavefront diamond blocking method [16,17]. The advantage of DiamondTorre in comparison to these is an efficient use of the GPGPU architecture.…”
Section: Benefits of the LRnLA Approach (mentioning)
confidence: 99%