2015
DOI: 10.1016/j.parco.2015.06.004
Blocking and parallelization of the Hari–Zimmermann variant of the Falk–Langemeyer algorithm for the generalized SVD

Cited by 19 publications (22 citation statements)
References 27 publications
“…If A and B are instead given implicitly by their factors F and G (not necessarily square nor with the same number of rows), respectively, such that (A, B) = (F^*F, G^*G), then the GEVD of (A, B) can be computed implicitly, i.e., without assembling A and B in entirety from the factors, by a modification of the Hari–Zimmermann algorithm (Novaković et al, 2015). However, pivot submatrices of A and B of a certain, usually small order are formed explicitly throughout the computation.…”
Section: Introductionmentioning
confidence: 99%
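The implicit formulation quoted above can be illustrated numerically. The sketch below assembles the pair (A, B) explicitly and solves it with `scipy.linalg.eigh`, merely to show what the implicit Hari–Zimmermann variant computes without ever forming A and B; the random factors F and G are hypothetical stand-ins, not data from the cited work.

```python
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
F = rng.standard_normal((8, 5))  # the factors need not be square
G = rng.standard_normal((7, 5))  # nor share the number of rows

# the pair that the implicit method targets without assembling it
A = F.T @ F
B = G.T @ G  # positive definite when G has full column rank

# reference solution of the GEVD: A x = lambda B x
w, X = eigh(A, B)

# each eigenpair satisfies the generalized eigenvalue equation
assert np.allclose(A @ X, (B @ X) * w)
```

The eigenvalues `w` are the squared ratios of the generalized singular values of the pair (F, G), which is why an implicit GEVD of (F^*F, G^*G) yields the GSVD.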
“…The recent work (Novaković et al, 2015) has shown that such a method can be blocked and parallelized for shared-memory nodes and for clusters thereof, albeit only real matrix pairs were considered therein. Even the sequential, blocked version outperformed the GSVD algorithm in LAPACK (Anderson et al, 1999), and the parallel versions exhibited decent scalability.…”
Section: Introductionmentioning
confidence: 99%
“…Nonetheless, the iteration matrices occasionally need normalization during the process. When the FL method is used as a kernel algorithm for the block Jacobi method, this can be a demanding task on contemporary CPU and GPU parallel computing machines [6]. Fortunately, one can use other Jacobi methods as kernel algorithms, in particular the Hari–Zimmermann (HZ) and the Cholesky–Jacobi (CJ) methods (see [1,8]), which can be seen as normalized versions of the FL method.…”
Section: Introductionmentioning
confidence: 99%
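The "normalized" kernel the excerpt alludes to can be sketched for a single 2×2 pivot pair. The routine below follows the Cholesky–Jacobi idea (factor the B-block, then apply one exact Jacobi rotation); it is an assumed illustration of the principle, not the exact Hari–Zimmermann or CJ formulas from the cited works.

```python
import numpy as np

def cj_2x2(A, B):
    """Diagonalize the symmetric 2x2 pair (A, B), with B positive
    definite, by congruence: return Z such that Z.T @ B @ Z = I and
    Z.T @ A @ Z is diagonal. Sketch of the Cholesky-Jacobi idea."""
    L = np.linalg.cholesky(B)        # B = L L^T
    Linv = np.linalg.inv(L)
    At = Linv @ A @ Linv.T           # reduced standard symmetric problem
    # one Jacobi rotation diagonalizes a 2x2 symmetric matrix exactly
    theta = 0.5 * np.arctan2(2.0 * At[0, 1], At[0, 0] - At[1, 1])
    c, s = np.cos(theta), np.sin(theta)
    Q = np.array([[c, -s], [s, c]])
    return Linv.T @ Q                # Z = L^{-T} Q

# hypothetical pivot pair, B positive definite
A = np.array([[4.0, 1.0], [1.0, 3.0]])
B = np.array([[2.0, 0.5], [0.5, 1.0]])
Z = cj_2x2(A, B)
```

By construction Z^T B Z = Q^T (L^{-1} B L^{-T}) Q = I, so no separate normalization step is needed, which is the advantage over the plain FL iteration that the excerpt describes.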
“…This typically happens in the course of modelling the parameters of a system. The described properties make the method an excellent choice for the kernel algorithm of the block-Jacobi methods, which are nowadays the prime choice for solving the definite GEP on contemporary parallel CPU and GPU computing machines [7]. In this short communication, we present the main formulas of the complex Falk–Langemeyer (CFL) algorithm that are derived in [3]. Although they are the proper generalization of those in the real method, their derivation is not trivial.…”
mentioning
confidence: 99%