2013
DOI: 10.1109/tmag.2013.2244861
Communication-Avoiding Krylov Techniques on Graphic Processing Units

Abstract: Communicating data within the graphic processing unit (GPU) memory system and between the CPU and GPU is a major bottleneck in accelerating Krylov solvers on GPUs. Communication-avoiding techniques reduce the communication cost of Krylov subspace methods by computing several vectors of a Krylov subspace "at once," using a kernel called "matrix powers." The matrix powers kernel is implemented on a recent generation of NVIDIA GPUs and speedups of up to 5.7 times are reported for the communication-avoiding matrix…
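The "matrix powers" kernel the abstract describes produces the s-step Krylov basis [x, Ax, A²x, …, Aˢx] in one fused pass, so an s-step solver can replace s separate communication rounds with one. A minimal sketch in plain Python; the function names and the adjacency-list sparse format are illustrative assumptions, not the paper's GPU implementation:

```python
# Hedged sketch of a matrix powers kernel: given sparse A and vector x,
# build the Krylov basis [x, A x, A^2 x, ..., A^s x] with one routine,
# rather than s independent SpMV calls. Names/layout are assumptions.

def spmv(rows, x):
    """Sparse matrix-vector product; rows[i] is a list of (col, val) pairs."""
    return [sum(v * x[j] for j, v in row) for row in rows]

def matrix_powers(rows, x, s):
    """Return the s-step Krylov basis vectors [x, A x, ..., A^s x]."""
    basis = [list(x)]
    for _ in range(s):
        basis.append(spmv(rows, basis[-1]))
    return basis

# Example: A = [[2, 0], [1, 1]] stored as per-row (col, val) lists.
A = [[(0, 2.0)], [(0, 1.0), (1, 1.0)]]
V = matrix_powers(A, [1.0, 0.0], 2)
# V -> [[1.0, 0.0], [2.0, 1.0], [4.0, 3.0]]
```

On a GPU, the point of fusing is that A's nonzeros and the intermediate vectors can be staged once in fast memory and reused across all s products, which is the communication saving the paper measures.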

Cited by 10 publications (7 citation statements). References 12 publications (12 reference statements).
“…A number of libraries and inspector-executor frameworks provide parallel implementations of fused sparse kernels with no loop-carried dependencies, such as two or more SpMV kernels [2,21,31,34,40] or SpMV and dot products [1,2,13,15,40,61]. The formulation of s-step Krylov solvers [6] has enabled iterations of iterative solvers to be interleaved, so that multiple SpMV kernels are optimized simultaneously by replicating computations to minimize communication costs [21,31,34,45].…”
Section: Related Work
confidence: 99%
“…Sparse tiling [26,47-49,51] is an inspector-executor approach that uses manually written inspectors [47,49] to group iterations of different loops of a specific kernel, such as Gauss-Seidel [49] and Moldyn [47], and has been generalized for parallel loops without loop-carried dependencies [26,51].…”
Section: Related Work
confidence: 99%
“…Most of these works focus on accelerating the execution of the compute-intensive kernels in the Krylov solvers. As shown in [4,56], such efforts are mainly communication-bound and are limited by the maximum performance achievable from parallelizing the compute-intensive kernels. The assembled matrix is usually large and sparse, and in many cases does not fit in the small, fast-access memory levels of the CPU and the co-processor memory.…”
Section: Introduction
confidence: 99%
“…This motivated research into the use of other, more well-conditioned bases for the Krylov subspaces, including scaled monomial bases [26], Chebyshev bases [29,16,17], and Newton bases [1,20]. The growing cost of communication in large-scale sparse problems has created a recent resurgence of interest in the implementation, optimization, and development of s-step Krylov subspace methods; see, e.g., the recent works [18,27,35,46,23,34,44,45,43].…”
confidence: 99%