Abstract. Sparse storage formats describe how sparse matrices are stored in computer memory. Extensive research has been conducted on these formats in the context of performance optimization of sparse matrix-vector multiplication algorithms, but memory-efficient formats for storing sparse matrices are still under development, since the commonly used storage formats (such as COO or CSR) are not sufficient. In this paper, we propose and evaluate new storage formats for sparse matrices that minimize the space complexity of the information about the matrix structure. The first is based on arithmetic coding and the second on a binary tree format. We compare the space complexity of common storage formats with that of our new formats and prove that the latter are considerably more space efficient.

Key words: sparse matrix representation; parallel execution; space efficiency; arithmetic-coding-based format; minimal binary tree format; minimal quadtree format

AMS subject classifications. 68M14, 68W10, 68P05, 68P20, 94A17

1. Introduction. This paper investigates memory-efficient storage formats for very large sparse matrices (LSMs). By LSMs, we mean matrices that, due to their size, must be stored and processed by massively parallel computer systems (MPCSs) with a distributed memory architecture consisting of tens or hundreds of thousands of processor cores.

In our previous work [9,12,11,8,7], we addressed weaknesses of previously developed solutions for space-efficient formats for storing large sparse matrices. The space complexity of the representation of a sparse matrix depends strongly on the matrix storage format used. A matrix of order n is considered sparse if it contains far fewer nonzero elements than n^2. Some alternative definitions of a sparse matrix can be found in [22]. In practice, a matrix is considered sparse if the ratio of nonzero elements drops below some threshold.
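For context, the two common baseline formats mentioned above can be sketched as follows. This is an illustrative sketch only, not code from the paper; the matrix and helper names are ours. It shows why CSR is already more compact than COO: COO stores two index entries per nonzero element, while CSR replaces the per-element row indices with n+1 row pointers.

```python
def to_coo(dense):
    """Coordinate (COO) format: one (row, col, value) triple per nonzero."""
    rows, cols, vals = [], [], []
    for i, row in enumerate(dense):
        for j, v in enumerate(row):
            if v != 0:
                rows.append(i)
                cols.append(j)
                vals.append(v)
    return rows, cols, vals

def to_csr(dense):
    """Compressed sparse row (CSR) format: row pointers replace
    the per-element row indices of COO."""
    col_idx, vals = [], []
    row_ptr = [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                col_idx.append(j)
                vals.append(v)
        row_ptr.append(len(vals))  # cumulative nonzero count after each row
    return row_ptr, col_idx, vals

# A small 4x4 example matrix with 4 nonzero elements.
A = [[5, 0, 0, 0],
     [0, 8, 0, 0],
     [0, 0, 3, 0],
     [0, 6, 0, 0]]

rows, cols, coo_vals = to_coo(A)
row_ptr, col_idx, csr_vals = to_csr(A)

# Structural (index) storage: COO needs 2 entries per nonzero (8 here);
# CSR needs one column index per nonzero plus n+1 row pointers (4 + 5 = 9
# for this tiny matrix, but only nnz + n + 1 in general, versus 2*nnz).
print(len(rows) + len(cols))        # 8
print(len(row_ptr) + len(col_idx))  # 9
```

For this tiny example CSR is not yet a win, but for a matrix with many more nonzero elements than rows (nnz >> n), CSR's nnz + n + 1 index entries undercut COO's 2*nnz. The formats proposed in this paper compress the structural information further still.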
Our research addresses computations with LSMs satisfying at least one of the following conditions:

1. The LSM is used repeatedly, and the computation of its elements is slow, taking more time than its later reading from a file system.
2. Construction of an LSM is memory-intensive: it requires a significant amount of memory for auxiliary data structures, typically of the same order of magnitude as the memory required for storing the LSM itself.
3. A solver requires the LSM in a format other than the one produced by a matrix generator, and the conversion between these formats cannot be performed effectively on the fly.
4. Computational tasks with LSMs need checkpointing and recovery from failures of the MPCSs. We assume that a distributed-memory parallel computation with an LSM takes a long time. To avoid recomputation in case of a system failure, we need to save the state of these long-running processes to allow fast recovery. This is especially important nowadays (and will be even more so in the future), when MPCSs consist of tens or hundreds of thousands of processor cores.

If at least one of these conditions is met, we might nee...