2020 IEEE/ACM 10th Workshop on Fault Tolerance for HPC at eXtreme Scale (FTXS) 2020
DOI: 10.1109/ftxs51974.2020.00010

A Generic Strategy for Node-Failure Resilience for Certain Iterative Linear Algebra Methods

Cited by 2 publications (3 citation statements)
References 13 publications
“…When the full state of the failed process is reconstructed, the computation can proceed on the replacement node. There is a generic strategy for identifying the state of an iterative linear algebra solver and for reconstructing the state upon recovery [14]. However, like prior work [16,17], we focus on the preconditioned conjugate gradient (PCG) solver, which solves the linear equation $Ax = b$ for a symmetric positive definite matrix $A_{n \times n}$ (see Algorithm 1).…”
Section: In-memory ESR and Its Challenges (citation type: mentioning)
confidence: 99%
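
For readers unfamiliar with the solver named in this statement, the following is a minimal sketch of a textbook preconditioned conjugate gradient iteration. It is not the Algorithm 1 of the citing paper, which is not reproduced on this page. The Jacobi (diagonal) preconditioner, the function names `matvec`, `dot`, and `pcg`, and the small test system are all assumptions chosen for illustration.

```python
def matvec(A, v):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))


def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    Assumption: a Jacobi preconditioner (inverse of the diagonal of A)
    stands in for whatever preconditioner a real application would use.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                                   # r = b - A x with x = 0
    inv_diag = [1.0 / A[i][i] for i in range(n)]  # Jacobi preconditioner
    z = [d * ri for d, ri in zip(inv_diag, r)]    # z = M^{-1} r
    p = list(z)                                   # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 <= tol:               # stop on small residual norm
            break
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)                   # step length
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz                        # direction update coefficient
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Hypothetical 2x2 SPD system; the exact solution is x = (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = pcg(A, b)
```

The vectors `x`, `r`, `z`, and `p` carried between iterations are the kind of solver state that a resilience scheme would need to recover after a node failure.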
“…These variables should be chosen such that all other significant variables can be reconstructed from their values. A generic method for this state identification for iterative solvers is described in [14]. It is also possible to take advantage of concurrent data distributed between nodes to reconstruct the state [16].…”
Section: In-memory ESR and Its Challenges (citation type: mentioning)
confidence: 99%
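
As a small illustration of one variable being reconstructible from others, the sketch below recomputes lost entries of a residual vector from the surviving iterate using the standard identity r = b - Ax. This is only an example of the general idea and is not the state-identification method of [14] or [16]. The helper name `reconstruct_residual`, the `lost_rows` parameter, and the sample data are hypothetical.

```python
def matvec(A, v):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def reconstruct_residual(A, b, x, lost_rows):
    """Recompute residual entries for the given rows from A, b, and x.

    Assumption: the rows of A, the entries of b, and the full iterate x
    are still available (for example, from static data and neighbours).
    """
    return {
        i: b[i] - sum(a * xj for a, xj in zip(A[i], x))
        for i in lost_rows
    }


# Hypothetical 3x3 system and current iterate.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = [0.5, 0.25, 1.0]

# Residual as it would exist before the failure.
r = [bi - axi for bi, axi in zip(b, matvec(A, x))]

# Suppose the node owning row 1 fails; rebuild its residual entry.
recovered = reconstruct_residual(A, b, x, lost_rows=[1])
```

Because the residual follows from `A`, `b`, and `x`, it does not have to be stored redundantly in this simplified setting; only the variables it depends on do.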