2014 IEEE 28th International Parallel and Distributed Processing Symposium 2014
DOI: 10.1109/ipdps.2014.48

Improving the Performance of CA-GMRES on Multicores with Multiple GPUs

Abstract: The Generalized Minimum Residual (GMRES) method is one of the most widely used iterative methods for solving nonsymmetric linear systems of equations. In recent years, techniques to avoid communication in GMRES have gained attention because, in comparison to floating-point operations, communication is becoming increasingly expensive on modern computers. Since graphics processing units (GPUs) are now becoming a crucial component in computing, we investigate the effectiveness of these techniques on multicore CPUs w…

Cited by 45 publications (35 citation statements)
References 18 publications
“…However, SVQR still requires computing the Gram matrix, leading to the same normwise upper-bound on the orthogonality error as CholQR. In this paper, we focus on CholQR since its implementation is simpler than that of SVQR, and in our previous studies, we did not identify a test case where CA-GMRES converges with SVQR but not with CholQR [25]. Nevertheless, most of the numerical analysis for CholQR can be trivially extended to SVQR [21].…”
Section: Communication-avoiding QR (CAQR), mentioning
confidence: 99%
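
The citation statement above refers to CholQR and its Gram matrix without spelling out the steps. As a rough illustration only, and not the cited authors' GPU implementation, the following pure-Python sketch shows the textbook form of CholQR: form the Gram matrix G = VᵀV, factor G = RᵀR by Cholesky, then recover Q = VR⁻¹. The function names `cholqr` and `orthogonality_error` are hypothetical, and the dense list-of-rows representation is an assumption made for readability.

```python
import math


def cholqr(V):
    """Textbook CholQR sketch: return (Q, R) such that V = Q R.

    V is a tall-skinny matrix stored as a list of m rows of length n.
    Illustrative only; not the cited paper's multi-GPU implementation.
    """
    m, n = len(V), len(V[0])

    # Step 1: Gram matrix G = V^T V (n x n). On a distributed machine this
    # is the step that needs a global reduction.
    G = [[sum(V[k][i] * V[k][j] for k in range(m)) for j in range(n)]
         for i in range(n)]

    # Step 2: Cholesky factorization G = R^T R with R upper triangular.
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        d = G[i][i] - sum(R[k][i] ** 2 for k in range(i))
        if d <= 0.0:
            # Happens when V is too ill-conditioned for G to stay
            # numerically positive definite.
            raise ValueError("Gram matrix is not numerically positive definite")
        R[i][i] = math.sqrt(d)
        for j in range(i + 1, n):
            s = sum(R[k][i] * R[k][j] for k in range(i))
            R[i][j] = (G[i][j] - s) / R[i][i]

    # Step 3: Q = V R^{-1}, computed row by row by solving q R = v.
    Q = []
    for v in V:
        q = [0.0] * n
        for j in range(n):
            s = sum(q[k] * R[k][j] for k in range(j))
            q[j] = (v[j] - s) / R[j][j]
        Q.append(q)
    return Q, R


def orthogonality_error(Q):
    """Largest absolute entry of Q^T Q - I."""
    n = len(Q[0])
    return max(
        abs(sum(row[i] * row[j] for row in Q) - (1.0 if i == j else 0.0))
        for i in range(n)
        for j in range(n)
    )
```

Because the Gram matrix squares the condition number of V, the orthogonality error of this scheme is usually described as growing with the square of that condition number, which is the kind of normwise bound the quoted passage says SVQR shares.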
“…In addition, Figure 4 shows their performance on up to three NVIDIA Tesla M2090 GPUs. A more detailed description of our implementations and the performance of these standard orthogonalization schemes can be found in [25]. 3.…”
Section: Communication-avoiding QR (CAQR), mentioning
confidence: 99%