2019
DOI: 10.1080/10618562.2019.1617856

MPI+X: task-based parallelisation and dynamic load balance of finite element assembly

Abstract: The main computing tasks of a finite element (FE) code for solving partial differential equations (PDEs) are the algebraic system assembly and the iterative solver. This work focuses on the first task, in the context of a hybrid MPI+X paradigm. Although we will describe algorithms in the FE context, a similar strategy can be straightforwardly applied to other discretization methods, like the finite volume method. The matrix assembly consists of a loop over the elements of the MPI partition to compute element m…

Citations: cited by 15 publications (14 citation statements)
References: 30 publications
“…But the lowest value of load balance appears in the computation of particles: L 96 = 0.02 means that globally 98% of the time of that phase is wasted. For a complete analysis of load unbalance in Alya, see Garcia-Gasulla et al (2018a).…”
Section: The Computational Challenge (mentioning)
confidence: 99%
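The quoted statement does not define the load balance value L. A minimal sketch, assuming the commonly used definition (mean per-process time divided by the maximum per-process time), shows how a value such as 0.02 would translate into 98% of the phase time being wasted. The function names and the timings below are hypothetical and are not taken from the cited paper or from Alya.

```python
from statistics import mean


def load_balance(times):
    """Load balance under the ASSUMED definition mean(t_i) / max(t_i).

    times: per-process elapsed times for one phase (hypothetical input).
    Returns a value in (0, 1]; 1.0 means perfectly balanced.
    """
    if not times or max(times) <= 0:
        raise ValueError("need at least one positive timing")
    return mean(times) / max(times)


def wasted_fraction(times):
    """Fraction of aggregate process time spent idle waiting for the slowest process."""
    return 1.0 - load_balance(times)


# Hypothetical timings: one slow process and three fast ones.
example_times = [10.0, 2.0, 2.0, 2.0]
example_balance = load_balance(example_times)      # mean 4.0 / max 10.0
example_wasted = wasted_fraction(example_times)
```

Under this assumed definition, a phase in which a single process does nearly all the work while the others wait drives the ratio toward 1/N, which is consistent with very small values being reported for a particle phase.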
“…Within the code of Alya, we already tested runtime mechanisms to mitigate load imbalance penalties on an Intel-based HPC cluster (Garcia-Gasulla et al, 2018a). This work is an extension of our previous work (Garcia-Gasulla et al, 2018b).…”
Section: Introduction and Related Work (mentioning)
confidence: 99%
“…But the lowest value of load balance appears in the computation of particles: L 96 = 0.02 means that globally 98% of the time of that phase is wasted. For a complete analysis of load unbalance in Alya see [12].…”
Section: Profile and Performance Analysis (mentioning)
confidence: 99%
“…[Figure 3: Execution modes for CFPD simulations with Alya (synchronous; n = f + p, n = f' + p')] a set of local operations is performed in order to assemble the local matrices. More details can be found in [12,14]. From the parallelism point of view this phase has two important characteristics:…”
Section: Step (mentioning)
confidence: 99%