2017
DOI: 10.3997/2214-4609.201702324

High-Performance Seismic Modeling with Finite-Difference Using Spatial and Temporal Cache Blocking


Cited by 9 publications (7 citation statements)
References: 0 publications
“…To increase data locality, TB through tiling techniques (Bandishti et al, 2012; Grosser et al, 2014b; Malas et al, 2015; Orozco and Gao, 2009; Strzodka et al, 2011; Wellein et al, 2009; Wonnacott, 2000; Yuan et al, 2017; Zhou, 2013) has been widely considered using various advanced programming models to favor asynchronous execution. Performance tuning using roofline models (Datta, 2009; Etienne et al, 2017; Nguyen et al, 2010; Titarenko and Hildyard, 2017) remains an important assessment step for stencil computations to ensure a good utilization of the underlying hardware resources. Some of these efforts have translated into software releases (e.g.…”
Section: Prior Work and Current Contributions (mentioning)
confidence: 99%
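
To make the roofline assessment mentioned in this statement concrete, here is a minimal sketch of the basic roofline bound: attainable performance is the smaller of the peak compute rate and the arithmetic intensity multiplied by the memory bandwidth. The function name and the machine numbers are hypothetical and chosen only for illustration; they do not come from the cited papers.

```python
def roofline_gflops(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """Attainable GFLOP/s under the basic roofline model.

    arithmetic_intensity is in FLOP per byte moved from main memory.
    """
    return min(peak_gflops, arithmetic_intensity * bandwidth_gbs)


# Hypothetical machine numbers, for illustration only.
PEAK_GFLOPS = 1000.0
BANDWIDTH_GBS = 100.0

# Intensity at which a kernel stops being memory-bound.
ridge = PEAK_GFLOPS / BANDWIDTH_GBS

low = roofline_gflops(0.5, PEAK_GFLOPS, BANDWIDTH_GBS)    # memory-bound regime
high = roofline_gflops(20.0, PEAK_GFLOPS, BANDWIDTH_GBS)  # compute-bound regime
print(ridge, low, high)
```

Stencil kernels tend to have low arithmetic intensity, so they usually sit on the bandwidth-limited side of the ridge. Cache blocking aims to reduce main-memory traffic, which raises the effective intensity.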
“…Finding the optimal block size is not straightforward since it depends on the FD scheme stencil, the model size and the underlying hardware (e.g., number of threads and cache size). We follow a simple approach proposed by Etienne et al (2017) to find the optimal setting. We perform a series of computations where the block size is parameterized and different for each computation.…”
Section: Spatial Blocking (mentioning)
confidence: 99%
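
The parameterized series of computations described in this statement can be sketched as an exhaustive sweep over candidate block sizes. This is an illustrative sketch, not the procedure from Etienne et al (2017): `sweep_block_sizes`, `time_once` and `toy_kernel` are hypothetical names, and the toy kernel only stands in for a real blocked FD time step.

```python
import itertools
import time


def time_once(kernel, bx, by):
    """Wall-clock seconds for one call of kernel(bx, by)."""
    start = time.perf_counter()
    kernel(bx, by)
    return time.perf_counter() - start


def sweep_block_sizes(measure, x_sizes, y_sizes):
    """Evaluate measure(bx, by) for every candidate pair.

    Returns the pair with the lowest cost and the full cost table.
    """
    costs = {(bx, by): measure(bx, by)
             for bx, by in itertools.product(x_sizes, y_sizes)}
    best = min(costs, key=costs.get)
    return best, costs


def toy_kernel(bx, by, n=32):
    """Hypothetical stand-in for a blocked FD step: visits an n x n plane tile by tile."""
    visited = 0
    for x0 in range(0, n, bx):
        for y0 in range(0, n, by):
            for x in range(x0, min(x0 + bx, n)):
                for y in range(y0, min(y0 + by, n)):
                    visited += 1
    return visited


best, costs = sweep_block_sizes(
    lambda bx, by: time_once(toy_kernel, bx, by),
    x_sizes=[4, 8, 16],
    y_sizes=[4, 8],
)
print("fastest block size in this run:", best)
```

Because `measure` is passed in, the same sweep works with any cost metric, for example the median of several timed runs. Timings from a single run are noisy, so repeating each measurement is advisable in practice.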
“…We assume it is better not to divide the domain along z, which corresponds to the inner index, in order to keep an efficient hardware vectorization through Intel AVX instructions. We perform the test using a grid of 512x512x512 points without CPML and determine that the optimal block size is 16 points along x, and 5 points along y (Etienne et al, 2017). These results are obtained using 32 OpenMP threads on one computing node of the supercomputer Shaheen II at KAUST, based on a dual-socket 16-core Intel Haswell CPU.…”
Section: Spatial Blocking (mentioning)
confidence: 99%
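
The loop structure implied by this statement, tiling along x and y while leaving the inner z loop whole, can be sketched as follows. This is a pure-Python illustration under stated assumptions: a second-order 7-point Laplacian with unit grid spacing, a tiny grid, and hypothetical names (`idx`, `laplacian_blocked`). It shows only the traversal order, not the performance effect, and it is not the authors' kernel.

```python
def idx(x, y, z, ny, nz):
    """Flat index with z as the fastest-varying (inner) dimension."""
    return (x * ny + y) * nz + z


def laplacian_blocked(u, nx, ny, nz, bx, by):
    """7-point Laplacian on interior points, tiled along x and y only."""
    out = [0.0] * len(u)
    for x0 in range(1, nx - 1, bx):          # loop over x blocks
        for y0 in range(1, ny - 1, by):      # loop over y blocks
            for x in range(x0, min(x0 + bx, nx - 1)):
                for y in range(y0, min(y0 + by, ny - 1)):
                    for z in range(1, nz - 1):  # full, unsplit sweep along z
                        c = idx(x, y, z, ny, nz)
                        out[c] = (u[idx(x - 1, y, z, ny, nz)]
                                  + u[idx(x + 1, y, z, ny, nz)]
                                  + u[idx(x, y - 1, z, ny, nz)]
                                  + u[idx(x, y + 1, z, ny, nz)]
                                  + u[c - 1] + u[c + 1]
                                  - 6.0 * u[c])
    return out


# Small test field u = x^2 + y^2 + z^2, whose discrete Laplacian is 6 everywhere inside.
nx, ny, nz = 7, 6, 5
u = [float(x * x + y * y + z * z)
     for x in range(nx) for y in range(ny) for z in range(nz)]

blocked = laplacian_blocked(u, nx, ny, nz, bx=2, by=3)
unblocked = laplacian_blocked(u, nx, ny, nz, bx=nx, by=ny)  # one block covers everything
print(blocked == unblocked)
```

Blocking changes only the order in which points are visited, so the blocked and unblocked results are identical. In a compiled OpenMP code the z loop would be the one the compiler vectorizes, which is the stated reason for not splitting it.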
“…We investigate the extension of Multicore-optimized Wavefront Diamond Temporal Blocking (MWD-TB) approach (Malas et al, 2015, 2017), already introduced for the second-order differential formulation (Etienne et al, 2017), to the first-order formulation. This is achieved by doubling the total number of time iterations and performing an update of the pressure wavefield and the velocity wavefield alternately (Figure 3).…”
Section: Temporal Cache Blocking (mentioning)
confidence: 99%
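
The doubling of time iterations described in this statement can be illustrated with a minimal 1D staggered-grid sketch. It assumes a simple leapfrog scheme with a single coefficient `c`, fixed zero-pressure boundaries, and the velocity updated first in each pair of half-iterations. These choices and all function names are assumptions made for illustration. The sketch does not implement MWD-TB itself; it only shows that a loop of `2 * nt` single-field updates reproduces the conventional loop of `nt` paired updates.

```python
def update_velocity(v, p, c):
    """v[i] sits between p[i] and p[i + 1] on the staggered grid."""
    for i in range(len(v)):
        v[i] += c * (p[i + 1] - p[i])


def update_pressure(p, v, c):
    """Interior pressure points only; both end points stay fixed."""
    for i in range(1, len(p) - 1):
        p[i] += c * (v[i] - v[i - 1])


def run_conventional(p, v, c, nt):
    """nt time steps, each updating both wavefields."""
    for _ in range(nt):
        update_velocity(v, p, c)
        update_pressure(p, v, c)


def run_alternating(p, v, c, nt):
    """2 * nt half-iterations, each updating a single wavefield."""
    for half_step in range(2 * nt):
        if half_step % 2 == 0:
            update_velocity(v, p, c)
        else:
            update_pressure(p, v, c)


def initial_fields(n):
    """Unit pressure spike in the middle, zero velocity (hypothetical test setup)."""
    p = [0.0] * n
    p[n // 2] = 1.0
    v = [0.0] * (n - 1)
    return p, v


p_a, v_a = initial_fields(21)
p_b, v_b = initial_fields(21)
run_conventional(p_a, v_a, c=0.5, nt=10)
run_alternating(p_b, v_b, c=0.5, nt=10)
print(p_a == p_b and v_a == v_b)
```

Written this way, every iteration is a single stencil sweep that reads one wavefield and updates the other, which is the uniform structure a temporal blocking scheme can tile across iterations.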