2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 2017
DOI: 10.1109/ipdpsw.2017.155

Auto-Tuning Strategies for Parallelizing Sparse Matrix-Vector (SpMV) Multiplication on Multi- and Many-Core Processors

Cited by 27 publications (15 citation statements)
References 33 publications
“…A great deal of software solutions has been published on accelerating sparse algebra kernels, mostly for SpMV [4], [9]-[11], [24], [25], [29], [36], [50], [53], [63], [64], but also for SpMM [35], [67], [70]. Most of these works are based on format and data transformations, where block-based sparse matrix representations have received most attention for two main reasons: 1) sparse matrices in real applications generally have a block sub-structure, and 2) on-chip memory requests may be decreased when using block relative indices instead of directly using row/column ones.…”
Section: Related Work (mentioning)
confidence: 99%
“…Hou et al [81] proposed an auto-tuning framework for AMD APU platforms to find appropriate binning scheme and select appropriate kernel for each bin. The process of grouping rows with similar number of nonzeros together is referred to as binning by the authors.…”
Section: Literature Survey (mentioning)
confidence: 99%
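
The binning idea described in the statement above can be illustrated with a minimal sketch. It assumes a CSR row-pointer array and groups rows by the power of two that bounds their nonzero count. The power-of-two rule and the name `bin_rows_by_nnz` are illustrative assumptions, not the binning scheme of the cited framework, which is selected by auto-tuning.

```python
from collections import defaultdict


def bin_rows_by_nnz(row_ptr):
    """Group CSR row indices into bins keyed by floor(log2(nnz)).

    Assumption for illustration: rows whose nonzero counts share the same
    power-of-two range land in the same bin. Empty rows get key -1.
    """
    bins = defaultdict(list)
    for row in range(len(row_ptr) - 1):
        nnz = row_ptr[row + 1] - row_ptr[row]  # nonzeros stored in this row
        key = nnz.bit_length() - 1             # floor(log2(nnz)), or -1 if nnz == 0
        bins[key].append(row)
    return dict(bins)


# Hypothetical 5-row matrix with 1, 0, 3, 8 and 2 nonzeros per row.
row_ptr = [0, 1, 1, 4, 12, 14]
bins = bin_rows_by_nnz(row_ptr)
```

Each bin could then be handed to a kernel suited to its row length, which is the selection step the statement attributes to the auto-tuner.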
“…It was shown that this approach appears to be the least efficient [14,15]. This follows from the overhead due to the sparse matrix format, from non-regular memory access, from a very low flop-to-byte ratio [21,22], and from problems concerning load imbalance [23]. Since SpMV is a memory-bound procedure, performance optimizations do not overcome the issue of considerable memory consumption.…”
Section: Ritz-Galerkin Formulation (mentioning)
confidence: 99%
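
To make the flop-to-byte point in the statement above concrete, here is a minimal CSR SpMV sketch with a rough arithmetic-intensity estimate. The traffic model is an assumption for illustration: 8-byte values, 4-byte indices, a square matrix, and `x` and `y` each moved through memory exactly once. The function names are hypothetical and do not come from the cited works.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR format."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # col_idx[k] drives an indirect, irregular read of x
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def flops_per_byte(n_rows, nnz, value_bytes=8, index_bytes=4):
    """Rough flop-to-byte estimate under the assumed traffic model."""
    flops = 2 * nnz  # one multiply and one add per stored nonzero
    bytes_moved = (
        nnz * (value_bytes + index_bytes)   # values and column indices
        + (n_rows + 1) * index_bytes        # row pointers
        + 2 * n_rows * value_bytes          # x read once, y written once
    )
    return flops / bytes_moved


# Hypothetical 3x3 matrix [[1, 0, 2], [0, 3, 0], [4, 0, 5]].
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
y = spmv_csr(row_ptr, col_idx, values, [1.0, 1.0, 1.0])
ratio = flops_per_byte(n_rows=3, nnz=5)
```

Under this model the ratio stays well below one flop per byte, because every nonzero costs an index load in addition to the value load. That is consistent with the statement's description of SpMV as memory-bound.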