Convex optimization over the spectrahedron, i.e., the set of all real n × n positive semidefinite matrices with unit trace, has important applications in machine learning, signal processing, and statistics, mainly as a convex relaxation for optimization problems with low-rank matrices. It is also one of the most prominent examples in the theory of first-order methods for convex optimization in which non-Euclidean methods can be significantly preferable to their Euclidean counterparts; the leading example is the Matrix Exponentiated Gradient (MEG) method, which is based on the Bregman distance induced by the (negative) von Neumann entropy. Unfortunately, implementing MEG requires a full SVD computation on each iteration, which does not scale to high-dimensional problems. In this work we propose efficient implementations of MEG, with both deterministic and stochastic gradients, which are tailored for optimization with low-rank matrices and use only a single low-rank SVD computation on each iteration. We also provide efficiently computable certificates for the correct convergence of our methods. Mainly, we prove that under a strict complementarity condition, the suggested methods converge from a "warm-start" initialization with rates similar to those of their full-SVD-based counterparts. Finally, we present empirical experiments which both support our theoretical findings and demonstrate the practical appeal of our methods.
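To make the per-iteration bottleneck concrete, the following is a minimal numpy sketch of one MEG step over the spectrahedron implemented with a full eigendecomposition. The names meg_step, eta, and grad are illustrative placeholders, not the authors' implementation; the point of the abstract above is precisely that this full decomposition can be replaced by a single low-rank SVD.

```python
# A minimal sketch of one full-decomposition MEG step over the
# spectrahedron {X PSD : Tr(X) = 1}. All names are illustrative.
import numpy as np

def meg_step(X, grad, eta):
    """One Matrix Exponentiated Gradient update:
    X_+ = exp(log X - eta * grad) / Tr(exp(log X - eta * grad))."""
    # Eigendecomposition of the symmetric iterate (the expensive step).
    evals, V = np.linalg.eigh(X)
    # Matrix logarithm via the spectral decomposition (clip for safety).
    log_X = (V * np.log(np.clip(evals, 1e-12, None))) @ V.T
    # Matrix exponential of the gradient step, again via eigendecomposition.
    w, U = np.linalg.eigh(log_X - eta * grad)
    w = np.exp(w - w.max())          # shift exponents for numerical stability
    return (U * w) @ U.T / w.sum()   # normalize back to unit trace
```

Both np.linalg.eigh calls cost O(n³), which is the scalability barrier the low-rank implementations are designed to avoid.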
Motivated by robust matrix recovery problems such as Robust Principal Component Analysis, we consider a general optimization problem of minimizing a smooth and strongly convex loss function applied to the sum of two blocks of variables, where each block of variables is constrained or regularized individually. We study a conditional gradient-type method which is able to leverage the special structure of the problem to obtain faster convergence rates than those attainable via standard methods, under a variety of assumptions. In particular, our method is appealing for matrix problems in which one of the blocks corresponds to a low-rank matrix, since it avoids the prohibitive full-rank singular value decompositions required by most standard methods. While our initial motivation comes from problems which originated in statistics, our analysis does not impose any statistical assumptions on the data.
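To illustrate why conditional gradient-type steps avoid full-rank SVDs, here is a hedged sketch of the standard rank-one linear minimization oracle over a nuclear-norm ball, the primitive such methods call in place of a projection. The names lmo_nuclear, cg_step, and tau are assumptions for this sketch, not the paper's notation.

```python
# A sketch of the linear minimization oracle (LMO) for a nuclear-norm
# ball of radius tau: only the top singular pair of the gradient is
# needed, instead of a full SVD. Names are illustrative.
import numpy as np
from scipy.sparse.linalg import svds

def lmo_nuclear(G, tau):
    """argmin over {V : ||V||_* <= tau} of <G, V>, which equals
    -tau * u1 @ v1.T for the top singular pair (u1, v1) of G."""
    u, s, vt = svds(G, k=1)          # rank-1 SVD: cheap vs. a full SVD
    return -tau * np.outer(u[:, 0], vt[0])

def cg_step(L, G, tau, t):
    """One conditional gradient update for the low-rank block."""
    V = lmo_nuclear(G, tau)          # extreme point of the nuclear-norm ball
    gamma = 2.0 / (t + 2.0)          # standard open-loop step size
    return (1 - gamma) * L + gamma * V
```

The 2/(t+2) step size is the textbook open-loop choice for conditional gradient methods; the paper's method exploits the two-block structure further, which this generic sketch does not show.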
Low-rank and nonsmooth matrix optimization problems capture many fundamental tasks in statistics and machine learning. While significant progress has been made in recent years in developing efficient methods for smooth low-rank optimization problems that avoid maintaining high-rank matrices and computing expensive high-rank SVDs, advances for nonsmooth problems have been slow-paced. In this paper we consider standard convex relaxations for such problems. Mainly, we prove that under a strict complementarity condition, and under the relatively mild assumption that the nonsmooth objective can be written as a maximum of smooth functions, approximated variants of two popular mirror-prox methods, the Euclidean extragradient method and mirror-prox with matrix exponentiated gradient updates, converge to an optimal solution with rate O(1/t) when initialized with a "warm-start", while requiring only two low-rank SVDs per iteration. Moreover, for the extragradient method we also consider relaxed versions of strict complementarity, which yield a trade-off between the rank of the SVDs required and the radius of the ball in which we need to initialize the method. We support our theoretical results with empirical experiments on several nonsmooth low-rank matrix recovery tasks, demonstrating both the plausibility of the strict complementarity assumption and the efficient convergence of our proposed low-rank mirror-prox variants.
* This manuscript significantly extends our NeurIPS 2021 paper [22] beyond the Euclidean extragradient method and also considers non-Euclidean Mirror-Prox with matrix exponentiated gradient updates.
¹ In [47, 7, 32] and [1] the authors consider SDPs with a linear objective function and affine constraints of the form A(X) = b. By incorporating the linear constraints into the objective function via an ℓ2 penalty term of the form λ‖A(X) − b‖₂, λ > 0, one obtains a nonsmooth objective function.
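The per-iteration structure quoted above (two projected steps, hence two SVD-based computations per iteration) can be sketched as follows for the Euclidean extragradient case applied to a saddle-point reformulation min over X, max over y of phi(X, y). The operator F, the projections proj_X and proj_y, and the step size eta are placeholders; the paper's low-rank variants replace the exact projections with rank-restricted ones.

```python
# A minimal Euclidean extragradient sketch: each iteration takes an
# extrapolation step and a correction step, each requiring one
# (possibly low-rank) SVD-based projection. Names are illustrative.
def extragradient_step(X, y, F, proj_X, proj_y, eta):
    """One extragradient iteration for min_X max_y phi(X, y),
    where F(X, y) returns (grad_X phi, grad_y phi)."""
    gX, gy = F(X, y)
    # Extrapolation ("look-ahead") step: first projection of the iteration.
    Xe, ye = proj_X(X - eta * gX), proj_y(y + eta * gy)
    gXe, gye = F(Xe, ye)
    # Correction step using the look-ahead gradients: second projection.
    return proj_X(X - eta * gXe), proj_y(y + eta * gye)
```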
Low-rank and nonsmooth matrix optimization problems capture many fundamental tasks in statistics and machine learning. While significant progress has been made in recent years in developing efficient methods for smooth low-rank optimization problems that avoid maintaining high-rank matrices and computing expensive high-rank SVDs, advances for nonsmooth problems have been slow-paced. In this paper we consider standard convex relaxations for such problems. Mainly, we prove that under a natural generalized strict complementarity condition, and under the relatively mild assumption that the nonsmooth objective can be written as a maximum of smooth functions, the extragradient method, when initialized with a "warm-start" point, converges to an optimal solution with rate O(1/t) while requiring only two low-rank SVDs per iteration. We give a precise trade-off between the rank of the SVDs required and the radius of the ball in which we need to initialize the method. We support our theoretical results with empirical experiments on several nonsmooth low-rank matrix recovery tasks, demonstrating that with simple initializations, the extragradient method produces exactly the same iterates when full-rank SVDs are replaced with SVDs whose rank matches that of the (low-rank) ground-truth matrix to be recovered.
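The SVD replacement described above amounts to a truncated projection. Below is a hedged sketch of a rank-r projection onto the spectrahedron {X PSD : Tr(X) = 1}, assuming (as in the standard construction, not quoted from the paper) that the exact projection diagonalizes the point and projects its eigenvalues onto the unit simplex; the truncated variant computes only the top-r eigenpairs, and the two coincide whenever the exact projection has rank at most r. Function names are illustrative.

```python
# A sketch of a rank-r truncated projection onto the spectrahedron:
# only the top-r eigenpairs of the symmetric point Y are computed,
# replacing a full eigendecomposition. Names are illustrative.
import numpy as np
from scipy.sparse.linalg import eigsh

def simplex_projection(v):
    """Euclidean projection of v onto the unit simplex (sorting method)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css - 1)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def rank_r_spectrahedron_projection(Y, r):
    """Truncated projection: use only the top-r eigenpairs of Y."""
    w, V = eigsh(Y, k=r, which='LA')   # top-r eigenpairs, not a full SVD
    lam = simplex_projection(w)        # project leading eigenvalues
    return (V * lam) @ V.T
```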