Summary. We introduce a general formulation for dimension reduction and coefficient estimation in the multivariate linear model. We argue that many of the existing methods that are commonly used in practice can be formulated in this framework and have various restrictions. We continue to propose a new method that is more flexible and more generally applicable. The method proposed can be formulated as a novel penalized least squares estimate. The penalty that we employ is the coefficient matrix's Ky Fan norm. Such a penalty encourages the sparsity among singular values and at the same time gives shrinkage coefficient estimates and thus conducts dimension reduction and coefficient estimation simultaneously in the multivariate linear model. We also propose a generalized cross-validation type of criterion for the selection of the tuning parameter in the penalized least squares. Simulations and an application in financial econometrics demonstrate competitive performance of the new method. An extension to the non-parametric factor model is also discussed.
In this paper we consider sparse approximation problems, that is, general l 0 minimization problems with the l 0 -"norm" of a vector being a part of constraints or objective function. In particular, we first study the first-order optimality conditions for these problems. We then propose penalty decomposition (PD) methods for solving them in which a sequence of penalty subproblems are solved by a block coordinate descent (BCD) method. Under some suitable assumptions, we establish that any accumulation point of the sequence generated by the PD methods satisfies the first-order optimality conditions of the problems. Furthermore, for the problems in which the l 0 part is the only nonconvex part, we show that such an accumulation point is a local minimizer of the problems. In addition, we show that any accumulation point of the sequence generated by the BCD method is a saddle point of the penalty subproblem. Moreover, for the problems in which the l 0 part is the only nonconvex part, we establish that such an accumulation point is a local minimizer of the penalty subproblem. Finally, we test the performance of our PD methods by applying them to sparse logistic regression, sparse inverse covariance selection, and compressed sensing problems. The computational results demonstrate that our methods generally outperform the existing methods in terms of solution quality and/or speed.
In this paper we analyze the randomized block-coordinate descent (RBCD) methods proposed in Nesterov (SIAM J Optim 22 (2):341-362, 2012), Richtárik and Takáč (Math Program 144(1-2):1-38, 2014) for minimizing the sum of a smooth convex function and a block-separable convex function, and derive improved bounds on their convergence rates. In particular, we extend Nesterov's technique developed in Nesterov (SIAM J Optim 22 (2):341-362, 2012) for analyzing the RBCD method for minimizing a smooth convex function over a block-separable closed convex set to the aforementioned more general problem and obtain a sharper expected-value type of convergence rate than the one implied in Richtárik and Takáč (Math Program 144 (1-2):1-38, 2014). As a result, we also obtain a better high-probability type of iteration complexity. In addition, for unconstrained smooth convex minimization, we develop a new technique called randomized estimate sequence to analyze the accelerated RBCD method proposed by Nesterov (SIAM J Optim 22(2):341-362, 2012) and establish a sharper expected-value type of convergence rate than the one given in Nesterov (SIAM J Optim 22(2):341-362, 2012).
We consider the problem of minimizing the sum of two convex functions: one is smooth and given by a gradient oracle, and the other is separable over blocks of coordinates and has a simple known structure over each block. We develop an accelerated randomized proximal coordinate gradient (APCG) method for minimizing such convex composite functions. For strongly convex functions, our method achieves faster linear convergence rates than existing randomized proximal coordinate gradient methods. Without strong convexity, our method enjoys accelerated sublinear convergence rates. We show how to apply the APCG method to solve the regularized empirical risk minimization (ERM) problem, and devise efficient implementations that avoid full-dimensional vector operations. For ill-conditioned ERM problems, our method obtains improved convergence rates than the state-of-the-art stochastic dual coordinate ascent (SDCA) method.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.