2012
DOI: 10.1007/978-3-642-33078-0_31
The Impact of Global Communication Latency at Extreme Scales on Krylov Methods

Abstract: Krylov Subspace Methods (KSMs) are popular numerical tools for solving large linear systems of equations. We consider their role in solving sparse systems on future massively parallel distributed memory machines, by estimating future performance of their constituent operations. To this end we construct a model that is simple, but which takes topology and network acceleration into account as they are important considerations. We show that, as the number of nodes of a parallel machine increases to very…

Cited by 8 publications (7 citation statements)
References 12 publications (16 reference statements)
“…Convergence is nearly identical to that of standard GMRES, except for a small delay. In Figure 5.1, bottom, the Newton basis is used with the zeros of the th order scaled and shifted (to [1,2]) Chebyshev polynomial as shifts, again in Leja ordering. Convergence is similar to standard GMRES.…”
Section: Numerical Results
confidence: 99%
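The shifts described in the quotation above — zeros of a Chebyshev polynomial scaled and shifted to [1,2], taken in Leja ordering — can be sketched as follows. The quoted passage elides the polynomial order, so degree 6 here is an arbitrary illustrative choice, and `leja_order` is a hypothetical helper implementing the standard greedy Leja procedure, not code from the cited work.

```python
import numpy as np

def chebyshev_zeros(n, a=1.0, b=2.0):
    """Zeros of the degree-n Chebyshev polynomial, mapped from [-1,1] to [a,b]."""
    k = np.arange(n)
    t = np.cos((2 * k + 1) * np.pi / (2 * n))   # zeros on [-1, 1]
    return 0.5 * (a + b) + 0.5 * (b - a) * t    # affine map to [a, b]

def leja_order(points):
    """Greedy Leja ordering: start at the point of largest modulus, then
    repeatedly pick the point maximizing the product of distances to the
    points already chosen."""
    pts = list(points)
    ordered = [max(pts, key=abs)]
    pts.remove(ordered[0])
    while pts:
        nxt = max(pts, key=lambda z: np.prod([abs(z - w) for w in ordered]))
        ordered.append(nxt)
        pts.remove(nxt)
    return np.array(ordered)

shifts = leja_order(chebyshev_zeros(6))
```

Leja ordering matters for the Newton basis because it keeps the partial products of the basis polynomials well scaled, which is what lets the s-step basis stay usable in finite precision.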
“…As the number of nodes increases, the global reductions required in lines 4 and 6 may well become the bottleneck [2,14,5] …”
Section: Standard GMRES
confidence: 99%
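The "lines 4 and 6" of standard GMRES in the quotation above are, plausibly, the Gram-Schmidt inner products and the normalization step of the Arnoldi process — the cited pseudocode is not reproduced here, so that mapping is an assumption. A minimal single-process sketch of Arnoldi with classical Gram-Schmidt, with comments marking where each step would become a global reduction (an MPI_Allreduce) if the vectors were distributed:

```python
import numpy as np

def arnoldi(A, v0, m):
    """Arnoldi process with classical Gram-Schmidt. The two commented steps
    are the ones that become global reductions when rows of the vectors are
    distributed across nodes; the SpMV needs only neighbor communication."""
    n = len(v0)
    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    V[:, 0] = v0 / np.linalg.norm(v0)
    for j in range(m):
        w = A @ V[:, j]                    # SpMV: neighbor communication only
        H[:j + 1, j] = V[:, :j + 1].T @ w  # inner products -> global reduction
        w -= V[:, :j + 1] @ H[:j + 1, j]
        H[j + 1, j] = np.linalg.norm(w)    # norm -> a second global reduction
        if H[j + 1, j] > 0:
            V[:, j + 1] = w / H[j + 1, j]
    return V, H
```

Because each iteration's reductions must complete before the next SpMV can start, their latency sits on the critical path — which is why it grows into a bottleneck as the node count increases.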
“…This resulted in speedups and improved scalability on distributed-memory machines. An analogous pipelined version of CG is presented in [18], and the pipelining approach is discussed further in [19]. Another pipelined algorithm, currently implemented in the SLEPc library [20], is the Arnoldi method with delayed reorthogonalization (ADR) [21].…”
Section: Related Work
confidence: 99%
“…This inner product must be completed before p k can be formed, and at least part of p k must be completed before the start of the next iteration computing Ap k . It has been observed that waiting for the two inner products to complete can be very costly when using large numbers of processors [1,5].…”
confidence: 99%
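The dependency chain the quotation above describes can be seen in textbook conjugate gradients: the inner product p·Ap must finish before x and r can be updated, and r·r must finish before the next p can be formed. A minimal sketch (standard CG, not the pipelined variant), with the two per-iteration global reductions marked:

```python
import numpy as np

def cg(A, b, tol=1e-10, maxit=200):
    """Textbook conjugate gradients. On a distributed machine each inner
    product below is a global reduction; both sit on the critical path,
    which is the synchronization cost the quoted passage refers to."""
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rr = r @ r                        # global reduction
    for _ in range(maxit):
        Ap = A @ p                    # SpMV: neighbor communication only
        alpha = rr / (p @ Ap)         # global reduction; blocks x, r updates
        x += alpha * p
        r -= alpha * Ap
        rr_new = r @ r                # global reduction; blocks the next p
        if np.sqrt(rr_new) < tol:
            break
        p = r + (rr_new / rr) * p     # cannot form p until rr_new arrives
        rr = rr_new
    return x
```

Pipelined variants such as the CG of [18] restructure these recurrences so the reductions can be overlapped with the SpMV instead of serializing each iteration.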