2015
DOI: 10.1016/j.parco.2015.04.004
Speculative segmented sum for sparse matrix-vector multiplication on heterogeneous processors

Abstract: Sparse matrix-vector multiplication (SpMV) is a central building block for scientific software and graph applications. Recently, heterogeneous processors composed of different types of cores have attracted much attention because of their flexible core configuration and high energy efficiency. In this paper, we propose a compressed sparse row (CSR) format based SpMV algorithm utilizing both types of cores in a CPU-GPU heterogeneous processor. We first speculatively execute segmented sum operations on the GPU part of…
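The segmented-sum formulation of CSR SpMV mentioned in the abstract can be sketched as follows. This is a simplified sequential illustration of the idea only, not the paper's speculative GPU algorithm; the function and variable names are hypothetical.

```python
import numpy as np

def csr_spmv_segmented_sum(row_ptr, col_idx, vals, x):
    """CSR SpMV expressed as a flat product followed by a segmented sum.

    Hypothetical sketch: the paper's speculative GPU variant is more
    involved; this only illustrates the segmented-sum formulation.
    """
    # Flat phase: element-wise products over all nonzeros, row-agnostic.
    products = vals * x[col_idx]
    # Segmented sum: reduce each row's slice [row_ptr[i], row_ptr[i+1]).
    m = len(row_ptr) - 1
    y = np.zeros(m)
    for i in range(m):
        y[i] = products[row_ptr[i]:row_ptr[i + 1]].sum()
    return y

# 3x3 example: [[1,0,2],[0,3,0],[4,5,6]] times [1,1,1]
row_ptr = np.array([0, 2, 3, 6])
col_idx = np.array([0, 2, 1, 0, 1, 2])
vals = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
print(csr_spmv_segmented_sum(row_ptr, col_idx, vals, np.ones(3)))
# -> [ 3.  3. 15.]
```

The flat product phase is embarrassingly parallel; the segmented sum is the load-balance-sensitive step that the paper speculatively offloads to the GPU cores.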

Cited by 54 publications (35 citation statements)
References 42 publications
“…Compared to SpGEMM, other sparse matrix multiplication operations (e.g., multiplication of a sparse matrix and a dense matrix [10,11,25] and its special case, sparse matrix-vector multiplication [7,8,9,26,27]) pre-allocate a dense resulting matrix or vector. Thus the size of the result of the multiplication is trivially predictable, and the corresponding entries are stored to predictable memory addresses.…”
Section: Memory Pre-allocation for the Resulting Matrix
confidence: 99%
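The predictable pre-allocation this statement describes can be illustrated with a minimal sketch (hedged illustration; the variable names are hypothetical):

```python
import numpy as np

# For SpMV (y = A @ x), the result is a dense vector whose length equals
# the number of matrix rows, so it can be allocated before any computation;
# each row i then writes to the fixed address y[i].
m = 4                # number of rows (known from the matrix shape)
y = np.zeros(m)      # pre-allocated dense result vector

# For SpGEMM (C = A @ B, both sparse), the number of nonzeros of C is not
# known in advance, so the output cannot be sized this simply.
print(y.shape)  # (4,)
```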
“…The number of nonzero elements of the matrices ranges from 100K to 200M [21]. The dataset includes both regular and irregular matrices, covering domains from scientific computing to social networks [24]. [Figure: bar chart of speedups over a NUMA-unaware baseline.]…”
Section: Evaluation Setup
confidence: 99%
“…These works include designing new programming language constructs [26] and developing parallelization frameworks [5,7,29,30,35]. Some of these studies have explored parallelism in irregular programs [9,15,20,25,31], which shares similarities with the parallelization of FSM computations, given that FSMs essentially run on an irregular data structure (a graph). Quinones and others [28] use pre-computation for speculative threading, which shares with speculative FSM parallelization the idea of exploiting constraints of the computation to facilitate speculative execution.…”
Section: Related Work
confidence: 99%