We propose a restricted SVD-based CUR (RSVD-CUR) decomposition for matrix triplets (A, B, G). Given matrices A, B, and G of compatible dimensions, such a decomposition provides a coordinated low-rank approximation of the three matrices using a subset of their rows and columns. We pick the subset of rows and columns of the original matrices by applying either the discrete empirical interpolation method (DEIM) or the L-DEIM scheme to the orthogonal and nonsingular matrices from the restricted singular value decomposition of the matrix triplet. We investigate the connections between a DEIM-type RSVD-CUR approximation and both a DEIM-type CUR factorization and a DEIM-type generalized CUR decomposition. We provide an error analysis showing that the accuracy of the proposed RSVD-CUR decomposition is within a factor of the approximation error of the restricted singular value decomposition of the given matrices. An RSVD-CUR factorization may be suitable for applications where we are interested in approximating one data matrix relative to two other given matrices. Two applications that we discuss are multi-view/label dimension reduction and data perturbation problems of the form A_E = A + BFG, where BFG is a nonwhite noise matrix. In numerical experiments, we show the advantages of the new method over the standard CUR approximation for these applications.
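The DEIM index selection named in the abstract above can be illustrated on a single matrix. The following is a minimal sketch, not the RSVD-CUR method itself: it assumes an ordinary SVD in place of the restricted SVD of a triplet, and the function names `deim` and `deim_cur`, as well as the choice of middle factor M = C⁺AR⁺, are illustrative assumptions rather than details taken from the abstract.

```python
import numpy as np

def deim(V):
    """Select k distinct row indices from an n-by-k basis V with greedy DEIM."""
    n, k = V.shape
    idx = [int(np.argmax(np.abs(V[:, 0])))]
    for j in range(1, k):
        # Interpolate column j at the chosen indices using the previous columns.
        c = np.linalg.solve(V[idx, :j], V[idx, j])
        # The residual vanishes (up to rounding) at the indices chosen so far.
        r = V[:, j] - V[:, :j] @ c
        idx.append(int(np.argmax(np.abs(r))))
    return np.array(idx)

def deim_cur(A, k):
    """Rank-k CUR sketch: DEIM on the leading k left/right singular vectors."""
    U, _, Vt = np.linalg.svd(A, full_matrices=False)
    p = deim(U[:, :k])       # row indices, from the left singular vectors
    q = deim(Vt[:k].T)       # column indices, from the right singular vectors
    C, R = A[:, q], A[p, :]
    # Assumed middle factor: project A onto the chosen columns and rows.
    M = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, M, R, p, q
```

For a matrix of exact rank k, this construction should reproduce A up to rounding, since the selected columns and rows then span its column and row spaces.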
A CUR factorization is often utilized as a substitute for the singular value decomposition (SVD), especially when a concrete interpretation of the singular vectors is challenging. Moreover, if the original data matrix possesses properties like nonnegativity and sparsity, a CUR decomposition can better preserve them compared to the SVD. An essential aspect of this approach is the methodology used for selecting a subset of columns and rows from the original matrix. This study investigates the effectiveness of one-round sampling and iterative subselection techniques and introduces new iterative subselection strategies based on iterative SVDs. One provably appropriate technique for index selection in constructing a CUR factorization is the discrete empirical interpolation method (DEIM). Our contribution aims to improve the approximation quality of the DEIM scheme by invoking it iteratively in several rounds, in the sense that we select subsequent columns and rows based on the previously selected ones. That is, we modify the data matrix A after each iteration by removing the information that has been captured by the previously selected columns and rows. We also discuss how iterative procedures for computing a few singular vectors of large data matrices can be integrated with the new iterative subselection strategies. We present the results of numerical experiments that compare one-round sampling and iterative subselection techniques and demonstrate the improved approximation quality associated with the latter.
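The idea of selecting indices in rounds and deflating the matrix between rounds can be sketched as follows. This is one possible reading under stated assumptions, not the authors' algorithm: each round picks a single column and a single row from the leading singular vectors of a residual, and the residual is formed by projecting out the columns (or rows) chosen so far. The function name `iterative_cur` and the deflation formulas are hypothetical, and k is assumed not to exceed the rank of A.

```python
import numpy as np

def iterative_cur(A, k):
    """Pick k columns and k rows, one per round, deflating between rounds.

    Assumes k <= rank(A); otherwise a residual becomes numerically zero and
    an index may repeat.
    """
    A = np.asarray(A, dtype=float)
    Ec = A.copy()   # residual used for column selection
    Er = A.copy()   # residual used for row selection
    cols, rows = [], []
    for _ in range(k):
        # Leading right singular vector of the column residual.
        v = np.linalg.svd(Ec, full_matrices=False)[2][0]
        # Leading left singular vector of the row residual.
        u = np.linalg.svd(Er, full_matrices=False)[0][:, 0]
        # DEIM applied to a single vector reduces to its largest entry.
        cols.append(int(np.argmax(np.abs(v))))
        rows.append(int(np.argmax(np.abs(u))))
        C, R = A[:, cols], A[rows, :]
        # Remove the information captured by the chosen columns and rows.
        Ec = A - C @ (np.linalg.pinv(C) @ A)
        Er = A - (A @ np.linalg.pinv(R)) @ R
    M = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, M, R, rows, cols
```

Recomputing a full SVD in every round is only practical for small matrices; for large data, an iterative solver that returns a few singular vectors would take its place, as the abstract indicates.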
The discrete empirical interpolation method (DEIM) may be used as an index selection strategy for formulating a CUR factorization. A notable drawback of the original DEIM algorithm is that the number of column or row indices that can be selected is limited to the number of input singular vectors. We propose a new variant of DEIM, which we call L-DEIM, that combines the strengths of deterministic leverage scores and DEIM. This method allows for the selection of a number of indices greater than the number of input singular vectors. Since DEIM requires singular vectors as input, L-DEIM is particularly attractive in, for example, big data problems where computing a rank-k SVD approximation is expensive even for moderately small k, because it uses a lower-rank SVD approximation instead of the full rank-k SVD. We empirically demonstrate the performance of L-DEIM, which, despite its efficiency, may achieve results comparable to the original DEIM and even better approximations than some state-of-the-art methods.
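A hybrid of DEIM and leverage scores, as described above, might look like the sketch below. The abstract does not spell out the procedure, so the details here are assumptions: the first k indices come from greedy DEIM on the k input vectors, and the remaining indices are ranked by the row norms of the deflated basis, used as leverage-score-like weights. The name `l_deim` and the parameter `k_hat` are hypothetical.

```python
import numpy as np

def l_deim(V, k_hat):
    """Select k_hat >= k distinct indices from an n-by-k basis V (a sketch)."""
    V = np.array(V, dtype=float)   # work on a copy; the input is left unchanged
    n, k = V.shape
    if not k <= k_hat <= n:
        raise ValueError("k_hat must satisfy k <= k_hat <= n")
    idx = []
    for j in range(k):
        # Greedy DEIM step: largest entry of the current (deflated) column.
        idx.append(int(np.argmax(np.abs(V[:, j]))))
        if j + 1 < k:
            # Deflate the next column so it vanishes at all selected indices.
            c = np.linalg.solve(V[idx, :j + 1], V[idx, j + 1])
            V[:, j + 1] -= V[:, :j + 1] @ c
    # Assumed ranking rule for the extra indices: row norms of the deflated basis.
    scores = np.linalg.norm(V, axis=1)
    scores[idx] = -1.0             # exclude indices that DEIM already chose
    extra = np.argsort(-scores, kind="stable")[:k_hat - k]
    return np.array(idx + extra.tolist())
```

Under these assumptions, a call such as `l_deim(V[:, :k], k_hat)` needs only k singular vectors to return k_hat indices, which is the cost advantage the abstract points to.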