2013
DOI: 10.1007/978-3-642-38718-0_36

Auto-tuning the Matrix Powers Kernel with SEJITS

Abstract: The matrix powers kernel, used in communication-avoiding Krylov subspace methods, requires runtime auto-tuning for best performance. We demonstrate how the SEJITS (Selective Embedded Just-In-Time Specialization) approach can be used to deliver a high-performance and performance-portable implementation of the matrix powers kernel to application authors, while separating their high-level concerns from those of auto-tuner implementers involving low-level optimizations. The benefits of delivering this ker…
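For orientation, the matrix powers kernel computes the Krylov basis vectors x, Ax, A²x, ..., Aᵏx for a sparse matrix A. The sketch below is only the naive baseline (k back-to-back sparse matrix-vector products over a CSR matrix), not the SEJITS-based implementation or any of the tuned communication-avoiding variants the abstract refers to. The function names `csr_spmv` and `matrix_powers_naive` and the small tridiagonal test matrix are hypothetical, chosen for illustration.

```python
from typing import List, Sequence


def csr_spmv(indptr: Sequence[int], indices: Sequence[int],
             data: Sequence[float], x: Sequence[float]) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CSR form."""
    n_rows = len(indptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Nonzeros of this row live in data[indptr[row]:indptr[row + 1]].
        for k in range(indptr[row], indptr[row + 1]):
            acc += data[k] * x[indices[k]]
        y[row] = acc
    return y


def matrix_powers_naive(indptr: Sequence[int], indices: Sequence[int],
                        data: Sequence[float], x: Sequence[float],
                        k: int) -> List[List[float]]:
    """Return [x, A x, A^2 x, ..., A^k x] using k successive SpMVs.

    Illustrative baseline only: each SpMV rereads the whole matrix,
    which is the data movement that tuned variants try to reduce.
    """
    basis = [[float(v) for v in x]]
    for _ in range(k):
        basis.append(csr_spmv(indptr, indices, data, basis[-1]))
    return basis


# Hypothetical 3x3 tridiagonal matrix [[2,-1,0],[-1,2,-1],[0,-1,2]] in CSR.
indptr = [0, 2, 5, 7]
indices = [0, 1, 0, 1, 2, 1, 2]
data = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]

basis = matrix_powers_naive(indptr, indices, data, [1.0, 0.0, 0.0], 2)
```

With this input, `basis` holds three vectors: the start vector, its product with A, and its product with A². An auto-tuner would choose among alternative implementations of the same computation for a given matrix and machine.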

Cited by 1 publication (1 citation statement)
References 8 publications
“…Therefore, those schemes require either redundant computations (explicit schemes) and/or irregular accesses to the matrix entries with bookkeeping (implicit schemes), resulting in performance bottlenecks. In [17] a runtime auto-tuning was introduced for the MPK scheme described above to choose the appropriate parameters (e.g., explicit vs. implicit schemes) for a given matrix. This was generalized to various kernels like Jacobi and serial Gauss-Seidel iterative solvers and automated using a sparse tiling algorithm via the power of loop chain abstraction [18], [19].…”
mentioning
confidence: 99%