2018
DOI: 10.1007/978-3-319-98521-3_1

The Impact of Taskyield on the Design of Tasks Communicating Through MPI

Cited by 18 publications (22 citation statements)
References 6 publications
“…It is also notable that the OmpSs-2 variants yield by far the highest performance, even though their task-structure is similar to that using OpenMP detached tasks. We attribute this to potential degradation of parallelism in Clang's libomp if tasks waiting in taskwait constructs (required to wait for nested tasks) are stacked up on the same thread, similar to issues reported for the taskyield directive [50]. The usage of MPI Continuations with OmpSs-2 provides a slight speedup of about 1.5% over TAMPI for class D, suggesting a more efficient communication management.…”
Section: NPB BT-MZ (mentioning)
confidence: 53%
“…With node-local programming models such as OpenMP, this tracking of active communication operations is left to the application layer. Unfortunately, the simple approach of test-yield cycles inside a task does not provide optimal performance due to CPU cycles being wasted on testing and (in the case of OpenMP) may not even be portable [50].…”
mentioning
confidence: 99%
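The test-yield cycle mentioned in the statement above can be sketched roughly as follows. This is a minimal illustration, not code from the cited paper: `mock_request`, `mock_test`, and `wait_with_taskyield` are hypothetical names, and `mock_test` only stands in for a completion check such as `MPI_Test` so that the sketch compiles without an MPI installation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a nonblocking communication request.
 * It "completes" after a fixed number of polls; a real code would
 * hold an MPI_Request here instead. */
typedef struct {
    int polls_remaining;
} mock_request;

/* Hypothetical stand-in for MPI_Test: returns true once the
 * operation has completed, false otherwise. */
static bool mock_test(mock_request *req)
{
    assert(req != NULL);
    if (req->polls_remaining > 0) {
        req->polls_remaining--;
        return false;
    }
    return true;
}

/* Test-yield cycle: poll for completion and, between polls, offer
 * the thread to other ready tasks. Returns the number of yields. */
static int wait_with_taskyield(mock_request *req)
{
    int yields = 0;
    while (!mock_test(req)) {
        /* taskyield is only a hint. An implementation may treat it as
         * a no-op, in which case this loop simply spins and burns CPU
         * cycles on testing. Without OpenMP support enabled, the
         * pragma is ignored altogether. */
        #pragma omp taskyield
        yields++;
    }
    return yields;
}
```

In a real application this loop would sit inside an OpenMP task that posted the nonblocking operation. Because the runtime is free to ignore the yield, forward progress of other tasks is not guaranteed, which is the portability concern the citing statements raise.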
“…Based on OpenMP 4.5, Schuchart et al [18] suggested to use an OpenMP taskyield based approach as sketched in Source Code 2. In the worst case, this approach can result in deadlock due to the limited guarantees OpenMP provides for taskyield.…”
Section: Interoperability With OpenMP Tasks (mentioning)
confidence: 99%
“…Code. Schuchart et al [18] provide two MPI-distributed and taskified versions of Blocked Cholesky Factorization. One version funnels all MPI communication through the master thread (singlecom); the other version has fine-grained dependencies and performs nonblocking communication in tasks (taskyield).…”
Section: Blocked Cholesky Factorization (mentioning)
confidence: 99%