Sparse Matrix-Vector multiplication (SpMV) is one of the most significant yet challenging issues in computational science area. It is a memory-bound application whose performance mostly depends on the input matrix and the underlying architecture. Many researchers have paid more attentions on exploring a variety of optimization techniques to SpMV. One of the most promising respects is how to adapt the storage format to satisfy the underlying architecture. Alterative storage formats can largely lessen memory pressure, however, the computational resources are often underutilized.Therefore, a new storage format, which is called Compressed Sparse Row with Segmented Interleave Combination (SIC), is proposed. Stemming from Compressed Sparse Row format (CSR), SIC format employs an interleave combination pattern that combines certain amount of CSR rows to form a new SIC row. In order to further improve performance, segmented processing is also brought in. According to the empirical data, we also develop an automatic SIC-based SpMV suitable for all the matrices. Experimental results show that our approach outperforms the NVIDIA CSR vector kernel, achieving up to 12.6× speedup. It also demonstrates a comparable performance with the Hybrid format, even with the highest 2.89× speedup.
SUMMARYThe challenge for Sparse Matrix-Vector multiplication (SpMV) performance is memory bandwidth, which mostly depends on input matrices and underlying computing platforms. To solve this challenge, many researchers have explored a variety of optimization techniques. One of the most promising aspects focuses on designing storage formats to represent sparse matrices. However, lots of prior storage formats cannot fully take advantage of the underlying computing platforms, resulting in unsatisfactory performance and large memory footprint. Therefore, a novel storage format, called Segmented Hybrid ELL + Compressed Sparse Row (CSR) (SHEC for short), is proposed to further improve the throughput and lessen memory footprint on Graphics Processing Unit (GPU). SHEC format employs an interleaved combination pattern, which combines certain amount of compressed rows to form a new SHEC row. Segmentation is brought in to balance load and reduce memory footprint. According to the empirical data, an automatic SHEC-based SpMV is developed to fit for all the matrices. Experimental results show that SHEC approach outperforms the best results of NVIDIA SpMV library and exhibits a comparable performance with state-of-the-art storage formats on the standard dataset.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.