2019 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
DOI: 10.1109/ipdpsw.2019.00117

Hierarchical Dynamic Loop Self-Scheduling on Distributed-Memory Systems Using an MPI+MPI Approach

Abstract: Computationally-intensive loops are the primary source of parallelism in scientific applications. Such loops are often irregular, and a balanced execution of their loop iterations is critical for achieving high performance. However, several factors may lead to an imbalanced load execution, such as problem characteristics and algorithmic and systemic variations. Dynamic loop self-scheduling (DLS) techniques are devised to mitigate these factors and, consequently, improve application performance. On distributed-mem…
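
As a concrete illustration of the chunk-based execution model behind DLS, the C sketch below (our own illustrative code, not the paper's implementation; all names are placeholders) shows two classic chunk-size rules: pure self-scheduling (SS), which assigns one iteration per request, and guided self-scheduling (GSS), which assigns remaining/workers iterations so chunks shrink as the loop drains.

    /* Minimal sketch of dynamic chunk-size rules used by common DLS
     * techniques; illustrative only, not the paper's scheduler. */
    #include <stdio.h>

    /* Self-scheduling (SS): one iteration per request. */
    static long chunk_ss(long remaining, int num_workers) {
        (void)num_workers;
        return remaining > 0 ? 1 : 0;
    }

    /* Guided self-scheduling (GSS): remaining iterations divided by
     * the number of workers, so chunks shrink as the loop drains. */
    static long chunk_gss(long remaining, int num_workers) {
        long c = remaining / num_workers;
        return c > 0 ? c : (remaining > 0 ? 1 : 0);
    }

    int main(void) {
        long remaining = 1000;   /* loop iterations left */
        int  workers   = 4;      /* processing elements  */
        while (remaining > 0) {
            long c = chunk_gss(remaining, workers);
            printf("assign chunk of %ld iterations\n", c);
            remaining -= c;
        }
        return 0;
    }

Adaptive rules such as GSS trade scheduling overhead (fewer requests than SS) against load balance (smaller chunks near the end absorb iteration-cost irregularity).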

Cited by 4 publications (3 citation statements, all mentioning). References 35 publications. Citing publications from 2019 and 2023.

“…(1) Separation between concepts and implementations: the DCA [11] and its hierarchical version [12] were motivated by the new advancements in the MPI 3.1 standard, namely MPI one-sided communication and MPI shared-memory. The following question arises: Is DCA limited to specific MPI features?…”
Section: Introduction (mentioning; confidence: 99%)
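
For readers unfamiliar with the MPI 3.1 features this statement refers to, the following C sketch shows how a decentralized chunk counter can be built on MPI one-sided communication (an atomic MPI_Fetch_and_op on a window hosted by rank 0). It is an illustration of the mechanism only, not the DCA implementation from [11] or [12]; the problem size and chunk size are placeholder constants.

    /* Sketch: decentralized chunk assignment via a shared iteration
     * counter accessed with MPI one-sided atomics. Illustrative of the
     * MPI-3 features referenced above, not the actual DCA code. */
    #include <mpi.h>
    #include <stdio.h>

    #define N_ITERS 1000
    #define CHUNK   50   /* fixed chunk size, for simplicity */

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        long *counter = NULL;
        MPI_Win win;
        /* Rank 0 exposes the global iteration counter in a window. */
        MPI_Win_allocate(rank == 0 ? sizeof(long) : 0, sizeof(long),
                         MPI_INFO_NULL, MPI_COMM_WORLD, &counter, &win);
        if (rank == 0) *counter = 0;      /* initialize before any access */
        MPI_Barrier(MPI_COMM_WORLD);
        MPI_Win_lock_all(0, win);         /* passive-target epoch */

        const long inc = CHUNK;
        long start;
        for (;;) {
            /* Atomically fetch the next chunk's start index from rank 0. */
            MPI_Fetch_and_op(&inc, &start, MPI_LONG, 0, 0, MPI_SUM, win);
            MPI_Win_flush(0, win);
            if (start >= N_ITERS) break;
            long end = start + CHUNK < N_ITERS ? start + CHUNK : N_ITERS;
            printf("rank %d executes iterations [%ld, %ld)\n",
                   rank, start, end);
            /* ... execute loop iterations start..end-1 here ... */
        }

        MPI_Win_unlock_all(win);
        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }

Because every rank grabs chunks directly with an atomic, no rank is dedicated to serving requests; this is the property that depends on one-sided (and, hierarchically, shared-memory) support in the MPI runtime.
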
“…We highlight specific requirements that a DLS technique needs to fulfill to separate chunk calculation that can be distributed across all PEs and the chunk assignment that should be synchronized across all PEs. In contrast to earlier efforts [11,12], we introduce and evaluate a two-sided MPI-based implementation of DCA. This implementation applies to all existing MPI runtime libraries because they fully support two-sided MPI communication.…”
Section: Introduction (mentioning; confidence: 99%)
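
As a contrast to the one-sided sketch above, the following C sketch shows the classic two-sided pattern this statement alludes to: a coordinator rank serves chunk requests over plain MPI_Send/MPI_Recv, so only baseline two-sided support is required of the MPI library. It is our illustrative sketch under the same placeholder constants, not the cited implementation.

    /* Sketch: chunk assignment synchronized via two-sided MPI; a
     * coordinator rank answers chunk requests. Illustrates the
     * portability argument above, not the authors' code. */
    #include <mpi.h>
    #include <stdio.h>

    #define N_ITERS 1000
    #define CHUNK   50
    #define TAG_REQ 1
    #define TAG_ANS 2

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0) {
            /* Coordinator: hand out [start, start+CHUNK) ranges until the
             * loop drains, then answer -1 to each remaining request. */
            long next = 0;
            int done = 0;
            while (done < size - 1) {
                long dummy, start;
                MPI_Status st;
                MPI_Recv(&dummy, 1, MPI_LONG, MPI_ANY_SOURCE, TAG_REQ,
                         MPI_COMM_WORLD, &st);
                if (next < N_ITERS) { start = next; next += CHUNK; }
                else                { start = -1;   done++;        }
                MPI_Send(&start, 1, MPI_LONG, st.MPI_SOURCE, TAG_ANS,
                         MPI_COMM_WORLD);
            }
        } else {
            /* Worker: request chunks until told there is no work left. */
            for (;;) {
                long dummy = 0, start;
                MPI_Send(&dummy, 1, MPI_LONG, 0, TAG_REQ, MPI_COMM_WORLD);
                MPI_Recv(&start, 1, MPI_LONG, 0, TAG_ANS, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                if (start < 0) break;
                long end = start + CHUNK < N_ITERS ? start + CHUNK : N_ITERS;
                printf("rank %d executes iterations [%ld, %ld)\n",
                       rank, start, end);
                /* ... execute loop iterations start..end-1 here ... */
            }
        }

        MPI_Finalize();
        return 0;
    }

The chunk calculation (here a fixed CHUNK; in general any DLS rule) is independent of the transport, while the chunk assignment is the part that must be synchronized; the two-sided variant buys portability at the cost of dedicating a rank to coordination.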