Optimization of sparse matrix–vector multiplication using reordering techniques on GPUs

Pichel, Juan C.; Rivera, Francisco F.; Fernández, Marcos; Rodríguez, Aurelio

doi:10.1016/j.micpro.2011.05.005

Cited by 46 publications

(26 citation statements)

References 17 publications


“…Nishtala et al [48] designed a high-level data partitioning method for SpMV to achieve better cache locality on multicore CPUs. Pichel et al [49] evaluated how reordering techniques influence performance of SpMV on GPUs. Baskaran and Bordawekar [50] improved off-chip and on-chip memory access patterns of SpMV on GPUs.…”

Section: Comparison To Related Methods (mentioning)

confidence: 99%

Speculative segmented sum for sparse matrix-vector multiplication on heterogeneous processors

Liu

Vinter

2015

Parallel Computing


Sparse matrix-vector multiplication (SpMV) is a central building block for scientific software and graph applications. Recently, heterogeneous processors composed of different types of cores have attracted much attention because of their flexible core configuration and high energy efficiency. In this paper, we propose a compressed sparse row (CSR) format based SpMV algorithm utilizing both types of cores in a CPU-GPU heterogeneous processor. We first speculatively execute segmented sum operations on the GPU part of a heterogeneous processor and generate possibly incorrect results. Then the CPU part of the same chip is triggered to re-arrange the predicted partial sums into a correct resulting vector. On three heterogeneous processors from Intel, AMD and nVidia, using 20 sparse matrices as a benchmark suite, the experimental results show that our method obtains significant performance improvement over the best existing CSR-based SpMV algorithms.
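
For readers unfamiliar with the terms in this abstract, the following is a minimal sequential sketch of CSR-based SpMV written as a segmented sum. It is not the speculative CPU-GPU method the abstract describes; the function name `csr_spmv` and the array names `row_ptr`, `col_idx`, and `values` are illustrative assumptions, chosen to match the usual CSR layout.

```python
def csr_spmv(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    Illustrative sketch only: plain sequential Python, not the
    speculative GPU/CPU algorithm from the abstract above.
    """
    # Step 1: one product per stored nonzero (value times matching x entry).
    products = [v * x[j] for v, j in zip(values, col_idx)]
    # Step 2: segmented sum. Row i owns products[row_ptr[i]:row_ptr[i + 1]],
    # so an empty row contributes an empty segment and sums to zero.
    n_rows = len(row_ptr) - 1
    return [sum(products[row_ptr[i]:row_ptr[i + 1]]) for i in range(n_rows)]


# 3x3 example matrix (hypothetical data):
# [[1, 0, 2],
#  [0, 0, 0],
#  [0, 3, 4]]
row_ptr = [0, 2, 2, 4]
col_idx = [0, 2, 1, 2]
values = [1.0, 2.0, 3.0, 4.0]
x = [1.0, 1.0, 1.0]
y = csr_spmv(row_ptr, col_idx, values, x)
```

The segment boundaries in `row_ptr` are what make the sum "segmented": a parallel implementation has to respect them even when rows have very different lengths or are empty, as the second row is here.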

“…Sparse matrices are matrices with a considerable number of zero value elements and the storage methods mainly developed to reduce storage space [8] by storing information only related to nonzero elements. Most of such methods used in optimizing data storage size, perform data processing, increasing computation performance etc in computer science [3][4] [5].…”

Section: Introduction (mentioning)

confidence: 99%

“…few applications need time efficient storage methods and few will need space efficient methods, storage methods are selected according to the purpose and applications [5][2] [6].…”

Section: Introduction (mentioning)

confidence: 99%

Sparse Matrix to Decimal Coding (SMDC) Algorithm

K¹,

Abideen²,

.³

2017

IJERA


We recently introduced a new method for sparse matrix storage [1] that considerably reduces storage space by storing only the nonzero elements along with the weight of each row (or column) and the number of rows (or columns). This paper discusses two algorithms: the SMDC Algorithm, which converts a sparse matrix into decimal coding format, and the Reverse SMDC Algorithm, which converts a decimally coded matrix back into the normal sparse matrix format. SMDC is a space-optimized storage method for sparse matrices. It can store a sparse matrix with m rows, n columns, and nnz nonzero elements in (the smaller of m or n) + nnz + 1 storage space, which is considerably more space-efficient than most sparse matrix storage methods.


“…Enhancements are implemented on graphics processing units (GPUs) to speed up the computation. Sparse matrix vector multiplication through reordering techniques has been explored with Tesla C1060 and Tesla M2050 [1]. The computation of isogeometric analysis stiffness matrix exhibits increased speed when implemented on GeForce GTX680 [2].…”

Section: Introduction (mentioning)

confidence: 99%