Parallel Algorithms and Applications, Vol. 13, pp. 265-187. © 1999 OPA (Overseas Publishers Association) N.V. Published by license under the Gordon and Breach Science Publishers imprint. Reprints available directly from the publisher. Photocopying permitted by license only.

The paper presents parallel algorithms for efficient solution of the Singular Value Decomposition (SVD) problem by the block two-sided Jacobi method. In this part of the work, we show how the method may be used on MIMD computers with hypercube and ring topologies. We analyse three types of orderings for solving SVD on block-structured submatrices from the point of view of communication requirements and suitability for parallel execution of the computational process. The algorithms map well onto the hypercube topology. Two of the ordering schemes can also be directly implemented on rings. Results obtained on an Intel Paragon are shown and discussed for all three types of orderings.
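To make the notion of a parallel ordering concrete, the following minimal sketch generates a generic round-robin (tournament) ordering: 2p blocks are grouped into p disjoint pairs per step, so p processors can work at once, and every pair of blocks meets exactly once per sweep. This is a textbook scheme chosen for illustration. The abstract does not name its three orderings, so this is not necessarily one of them, and the function name `round_robin_ordering` is hypothetical.

```python
def round_robin_ordering(num_blocks):
    """Return the parallel steps of one sweep over all block pairs.

    Each step is a list of num_blocks // 2 disjoint pairs (i, j) with i < j.
    Over num_blocks - 1 steps, every pair of blocks appears exactly once.
    Illustrative only: a generic tournament scheme, not taken from the paper.
    """
    if num_blocks < 2 or num_blocks % 2 != 0:
        raise ValueError("num_blocks must be even and at least 2")

    ring = list(range(num_blocks))
    half = num_blocks // 2
    steps = []
    for _ in range(num_blocks - 1):
        # Pair the k-th entry from the front with the k-th entry from the back.
        pairs = [
            tuple(sorted((ring[k], ring[num_blocks - 1 - k])))
            for k in range(half)
        ]
        steps.append(pairs)
        # Keep the first block fixed and rotate all others by one position.
        ring = [ring[0], ring[-1]] + ring[1:-1]
    return steps


if __name__ == "__main__":
    for step_index, pairs in enumerate(round_robin_ordering(8)):
        print(step_index, pairs)
```

In each step, only one block is fixed and the others shift by one position, which is why schemes of this kind tend to need only neighbour-to-neighbour transfers; how well a given ordering maps onto a ring or a hypercube is exactly the question the paper analyses.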
Five variants of a new dynamic ordering are presented for the parallel one-sided block Jacobi SVD algorithm. As in the two-sided algorithm, the dynamic ordering takes into account the actual status of the matrix, in this case of its block columns with respect to their mutual orthogonality. The variants differ in their computational and communication complexities and in the proposed global and local stopping criteria. Their performance is tested on a square random matrix of order 8192 with a random distribution of singular values using p = 16, 32, 64, 96 and 128 processors. All variants of dynamic ordering are compared with a parallel cyclic ordering, the two-sided block Jacobi method with dynamic ordering, and the ScaLAPACK routine PDGESVD with respect to the number of parallel iteration steps needed for convergence and the total parallel execution time. The relative errors in the orthogonality of the computed left singular vectors and in the matrix assembled from the computed singular triplets are also discussed. It turns out that variant 3, for which local optimality in a precisely defined sense can be proved, and its combination with variant 2 are the most efficient. For a relatively small blocking factor of 2p, they outperform the ScaLAPACK routine PDGESVD and are about twice as fast.
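The idea of choosing pivots from the current orthogonality status of the columns can be illustrated with a serial, scalar-column sketch of the one-sided Jacobi method. It is a simplification under stated assumptions: single columns stand in for block columns, the pair weight is the normalized inner product (cosine) of two columns, and the stopping test is a plain threshold on the largest weight. None of this is claimed to match any of the paper's five variants or its stopping criteria, and the function name `one_sided_jacobi_dynamic` is hypothetical.

```python
import math


def one_sided_jacobi_dynamic(a, tol=1e-12, max_rotations=10_000):
    """Singular values of a dense matrix `a` (list of rows) by one-sided Jacobi.

    At every step the column pair that is currently least orthogonal is
    rotated (an assumed, simplified form of dynamic ordering).
    Returns (singular values in descending order, number of rotations).
    """
    m, n = len(a), len(a[0])
    # Work on a column-major copy so each column is one Python list.
    cols = [[a[i][j] for i in range(m)] for j in range(n)]

    def dot(x, y):
        return sum(xi * yi for xi, yi in zip(x, y))

    rotations = 0
    while rotations < max_rotations:
        # Find the pair of columns with the largest |cosine| between them.
        best, best_pair = 0.0, None
        for p in range(n - 1):
            for q in range(p + 1, n):
                denom = math.sqrt(dot(cols[p], cols[p]) * dot(cols[q], cols[q]))
                if denom == 0.0:
                    continue  # a zero column is orthogonal to everything
                weight = abs(dot(cols[p], cols[q])) / denom
                if weight > best:
                    best, best_pair = weight, (p, q)
        if best_pair is None or best <= tol:
            break  # all columns are mutually orthogonal to within tol

        p, q = best_pair
        x, y = cols[p], cols[q]
        alpha, beta, gamma = dot(x, x), dot(y, y), dot(x, y)
        # Plane rotation that makes the two columns exactly orthogonal.
        zeta = (beta - alpha) / (2.0 * gamma)
        t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
        c = 1.0 / math.sqrt(1.0 + t * t)
        s = c * t
        cols[p] = [c * xi - s * yi for xi, yi in zip(x, y)]
        cols[q] = [s * xi + c * yi for xi, yi in zip(x, y)]
        rotations += 1

    sigma = sorted((math.sqrt(dot(col, col)) for col in cols), reverse=True)
    return sigma, rotations


if __name__ == "__main__":
    sigma, rotations = one_sided_jacobi_dynamic([[3.0, 0.0], [4.0, 5.0]])
    print(sigma, rotations)
```

The exhaustive search for the best pair costs O(n^2) inner products per rotation, which is affordable only in a toy setting. The trade-off between the cost of evaluating the ordering and the number of iteration steps saved is the kind of difference in computational and communication complexity that distinguishes the variants described above.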