2016 International Conference on High Performance Computing & Simulation (HPCS) 2016
DOI: 10.1109/hpcsim.2016.7568426

A cache-aware approach to domain decomposition for stencil-based codes

Abstract: Partial Differential Equations (PDEs) lie at the heart of numerous scientific simulations depicting physical phenomena. The parallelization of such simulations introduces additional performance penalties in the form of local and global synchronization among cooperating processes. Domain decomposition partitions the largest shareable data structures into sub-domains and attempts to achieve perfect load balance and minimal communication. Up to now research efforts to optimize spatial and temporal cache …

Cited by 4 publications (19 citation statements)
References 18 publications
“…We investigate the optimality of partitions returned by MPI_Dims_create() and whether only minimizing communication is sufficient to obtain optimal sub‐domain dimensions for parallel GMG. Our work in Saxena et al demonstrated the dependence of domain partitioning for single grids on cache‐misses in computation and communication. Since parallel GMG is significantly more complex than a single grid and incorporates further stencil operators, the current research examines the efficacy of extending the model to parallel GMG.…”
Section: Terminology and Problem Description (mentioning)
confidence: 70%
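
The criterion questioned in the statement above, choosing sub-domain dimensions only by minimizing communication, can be illustrated with a small sketch. This is not the implementation of MPI_Dims_create() and not the model from the cited paper; it is a hypothetical brute-force search, and the names `factorizations`, `halo_surface`, and `min_surface_dims` are assumptions made for illustration. It assumes a 3D grid that the process grid divides evenly and uses the face area of one sub-domain as a stand-in for halo-exchange volume.

```python
def factorizations(p, ndims=3):
    """Yield every ordered tuple of ndims positive integers whose product is p."""
    if ndims == 1:
        yield (p,)
        return
    for d in range(1, p + 1):
        if p % d == 0:
            for rest in factorizations(p // d, ndims - 1):
                yield (d,) + rest


def halo_surface(grid, dims):
    """Cells on the six faces of one sub-domain (proxy for communication volume).

    Assumes every entry of dims divides the matching entry of grid.
    """
    nx, ny, nz = (g // d for g, d in zip(grid, dims))
    return 2 * (nx * ny + ny * nz + nx * nz)


def min_surface_dims(grid, p):
    """Process grid with the smallest sub-domain surface (communication only)."""
    candidates = [
        dims for dims in factorizations(p)
        if all(g % d == 0 for g, d in zip(grid, dims))
    ]
    return min(candidates, key=lambda dims: halo_surface(grid, dims))


if __name__ == "__main__":
    grid = (64, 64, 64)
    print(min_surface_dims(grid, 8))
    print(halo_surface(grid, (8, 1, 1)), halo_surface(grid, (1, 1, 8)))
```

In this sketch the decompositions (8, 1, 1) and (1, 1, 8) have the same surface count, although they cut the array along different axes. A communication-only criterion cannot tell them apart, which is the gap a cache-aware criterion would try to address.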
“…However, in some applications due to, e.g., varying coefficients, nonlinearities, or more complex grids, this may not be the case. Since data is mapped out linearly in memory regardless of the dimension of an array data structure, the non‐contiguous access pattern produced by stencils increases the cache‐misses. Efforts have been made to optimize and exploit spatial and temporal principles of the cache memory hierarchy to bridge the gap between the fast processor speed and the comparatively slower memory access times.…”
Section: Background and Related Work (mentioning)
confidence: 99%
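
The remark about linear memory layout can be made concrete with a minimal sketch. It assumes a row-major (C-style) layout and a 7-point stencil on a 3D array; the statement above only says the data is laid out linearly, so both choices, and the helper names `linear_index` and `stencil_offsets`, are assumptions made for illustration.

```python
def linear_index(i, j, k, ny, nz):
    """Position of element (i, j, k) in a row-major array of shape (nx, ny, nz)."""
    return (i * ny + j) * nz + k


def stencil_offsets(ny, nz):
    """Distance in elements from the centre to each neighbour of a 7-point stencil."""
    return {
        "k-1": -1,
        "k+1": 1,
        "j-1": -nz,
        "j+1": nz,
        "i-1": -ny * nz,
        "i+1": ny * nz,
    }


if __name__ == "__main__":
    ny, nz = 512, 512
    for name, offset in stencil_offsets(ny, nz).items():
        print(name, offset)
```

Only the two neighbours along the fastest-varying axis are adjacent in memory. The neighbours along the other axes are `nz` and `ny * nz` elements away, so for large arrays they typically fall in different cache lines, which is the non-contiguous access pattern the statement refers to.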