2015
DOI: 10.1137/140980326

On Vector-Kronecker Product Multiplication with Rectangular Factors

Abstract: The infinitesimal generator matrix underlying a multidimensional Markov chain can be represented compactly by using sums of Kronecker products of small rectangular matrices. For such compact representations, analysis methods based on vector-Kronecker product multiplication need to be employed. When the factors in the Kronecker product terms are relatively dense, vector-Kronecker product multiplication can be performed efficiently by the shuffle algorithm. When the factors are relatively sparse, it may…


Cited by 11 publications (8 citation statements)
References 32 publications (22 reference statements)
“…When the factors are relatively sparse, it may be more efficient to obtain nonzeros of the generator in Kronecker form on the fly and multiply them with corresponding elements of the vector [6]. Recently, the shuffle algorithm has been modified so that relevant elements of the vector are multiplied with submatrices of factors in which zero rows and columns are omitted [8]. This approach is shown to avoid unnecessary floating-point operations (flops) that evaluate to zero during the course of the multiplication and possibly to reduce the amount of memory used.…”
Section: Introduction
confidence: 99%
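The plain shuffle algorithm mentioned in this statement computes a vector-Kronecker product one factor at a time, without ever forming the full Kronecker product matrix. A minimal NumPy sketch, assuming square factors for brevity (the function name is hypothetical, not from the paper):

```python
import numpy as np

def shuffle_multiply(x, factors):
    """Compute x @ (A1 ⊗ A2 ⊗ ... ⊗ AN) one factor at a time.

    Illustrative sketch of the shuffle idea with square factors;
    not the authors' implementation.
    """
    sizes = [A.shape[0] for A in factors]
    nleft, nright = 1, int(np.prod(sizes))
    z = np.asarray(x, dtype=float)
    for A in factors:
        ni = A.shape[0]
        nright //= ni
        z = z.reshape(nleft, ni, nright)    # expose the current dimension
        z = np.einsum('lir,ij->ljr', z, A)  # multiply along that dimension
        z = z.reshape(-1)
        nleft *= ni
    return z

# Check against the explicitly formed Kronecker product on a toy example.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])
x = np.arange(4.0)
print(np.allclose(shuffle_multiply(x, [A, B]), x @ np.kron(A, B)))
```

Each pass touches the whole vector once, so the total cost is roughly L · Σᵢ nᵢ flops instead of the L² of an explicit matrix-vector product.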
“…This comparison is justified by a considerable gain in memory efficiency and a reduction in CPU usage when using the GTA formalism. The work of Dayar and Orhan [8] is primarily based on optimizing the execution of the shuffle algorithm in order to improve data locality. The optimization also reduces the number of flops; this is accomplished by focusing the computation on the nonzero values of the matrices, thus avoiding flops that involve zero rows and columns.…”
Section: Related Work
confidence: 99%
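The zero-row/column skipping described in this statement can be illustrated roughly as follows: before multiplying along one dimension, restrict the factor to its nonzero rows and columns and touch only the corresponding slices of the vector. This is a dense NumPy sketch, not the authors' implementation; the helper name and the (nleft, ni, nright) layout are assumptions:

```python
import numpy as np

def multiply_axis_skip_zeros(z, A):
    """Multiply z (shape nleft x ni x nright) along its middle axis by A,
    restricting the work to A's nonzero rows and columns.

    Illustrative sketch only; a real implementation would use sparse factors.
    """
    rows = np.flatnonzero(np.abs(A).sum(axis=1))  # rows of A with a nonzero
    cols = np.flatnonzero(np.abs(A).sum(axis=0))  # columns of A with a nonzero
    out = np.zeros((z.shape[0], A.shape[1], z.shape[2]))
    out[:, cols, :] = np.einsum('lir,ij->ljr',
                                z[:, rows, :], A[np.ix_(rows, cols)])
    return out

# Toy check: A has one all-zero row and one all-zero column.
rng = np.random.default_rng(0)
z = rng.standard_normal((2, 3, 2))
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 0.0, 0.0],
              [3.0, 0.0, 4.0]])
print(np.allclose(multiply_axis_skip_zeros(z, A),
                  np.einsum('lir,ij->ljr', z, A)))
```

The submatrix multiplication does the same arithmetic as the full product minus the terms that are guaranteed to be zero, which is the flop saving the citing authors refer to.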
“…The Kronecker product matrix is defined as a block matrix formed with a special multiplication between two matrices [8]. The problem is that, given N square matrices A^(i) of order n_i and a vector x ∈ R^(1×L) where L = ∏_{i=1}^{N} n_i, the complexity of building this matrix is O(∏_{i=1}^{N} n_i^2).…”
Section: Introduction
confidence: 99%
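The quoted complexity can be made concrete: storing the Kronecker product explicitly requires ∏ n_i² entries, while keeping the factors requires only Σ n_i². A small back-of-the-envelope check with illustrative dimension sizes:

```python
from math import prod

# Four dimensions, each of order 10 (illustrative values).
n = [10, 10, 10, 10]

explicit = prod(ni * ni for ni in n)  # entries of the full Kronecker product
compact = sum(ni * ni for ni in n)    # entries stored in factor form

print(explicit, compact)  # 100000000 vs 400
```

This gap, exponential in the number of dimensions, is why the analysis methods discussed here operate on the factors directly rather than on the assembled matrix.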
“…In practice, the matrices Q_{k,h} are sparse [3] and held in sparse row format, since the nonzeros in each of their rows indicate the possible transitions from the state with that row index. The advantage of partitioning the reachable state space is the elimination of unreachable states from the set of rows and columns of the generator, avoiding unnecessary computational effort due to unreachable states (see, for instance, [2,15]) and allowing vectors no larger than |R| in the analysis. The Kronecker form of the blocks Q^{(i,j)} in Q has been studied before for a number of models [21-23].…”
Section: Compact Vectors in Kronecker Setting
confidence: 99%
“…Starting from an initial solution, the compact vector in HTD format was iteratively multiplied 1000 times with the uniformized generator matrix of a given CTMC in Kronecker form. The same numerical experiment was performed with a solution vector of the same size as the reachable state space, using an improved version of the shuffle algorithm [15]. For a fixed truncation error tolerance strategy in the HTD format, the two approaches were compared for memory, time, and accuracy, leading to the preliminary conclusion that compact vectors in HTD format become more memory efficient as the number of dimensions increases.…”
Section: Introduction
confidence: 99%
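The iteration described in this statement, repeatedly multiplying a solution vector with a uniformized generator matrix, can be sketched on a toy two-state CTMC. This is ordinary dense power iteration with illustrative values, not the HTD-format or Kronecker-form computation from the paper:

```python
import numpy as np

# Toy CTMC generator (illustrative values): rows sum to zero.
Q = np.array([[-2.0, 2.0],
              [1.0, -1.0]])

# Uniformization: P = I + Q / Lambda with Lambda >= max_i |Q[i,i]|,
# which turns Q into a row-stochastic matrix P.
Lam = max(-Q.diagonal())
P = np.eye(2) + Q / Lam

# Repeatedly multiply the solution vector with P, as in the experiment above.
pi = np.array([1.0, 0.0])  # initial solution
for _ in range(1000):
    pi = pi @ P

print(pi)  # converges to the stationary distribution of Q
```

In the paper's setting the vector has one entry per reachable state, so the point of the HTD format is to keep this repeatedly updated vector compact as the number of dimensions grows.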