2018
DOI: 10.1177/1094342018817088

Adaptive control in roll-forward recovery for extreme scale multigrid

Abstract: With the increasing number of compute components, failures in future exa-scale computer systems are expected to become more frequent. This motivates the study of novel resilience techniques. Here, we extend a recently proposed algorithm-based recovery method for multigrid iterations by introducing an adaptive control. After a fault, the healthy part of the system continues the iterative solution process, while the solution in the faulty domain is re-constructed by an asynchronous on-line recovery. The computat…

Citations: cited by 3 publications (2 citation statements)
References: 76 publications
“…In combination with the tearing and intersection approach for the recovery, it results in a hybrid approach. In case of a Stokes-type system, yielding after discretization a saddle point problem, the strategy can either be applied on the positive definite Schur complement for the pressure or, as it was done in Huber et al (2019), on the indefinite velocity-pressure system. In that case, an all-at-once multigrid method with an Uzawa-type smoother acting on both solution components turns out to be most efficient (see Drzisga et al, 2018).…”
Section: Numerical Algorithms For Resilience
Citation type: mentioning
confidence: 99%
“…In that case, an all-at-once multigrid method with an Uzawa-type smoother acting on both solution components turns out to be most efficient (see Drzisga et al, 2018). Numerical and algorithmic studies including multiple faults and large-scale problems with more than 5 ⋅ 10^11 degrees of freedom and more than 245,000 cores have been demonstrated (Huber et al, 2016, 2019). The automatic re-coupling strategy is found to be robust with respect to the fault location and size and also handling multiple fault.…”
Section: Numerical Algorithms For Resilience
Citation type: mentioning
confidence: 99%