2009
DOI: 10.1214/08-aos659
Some sharp performance bounds for least squares regression with L1 regularization

Abstract: We derive sharp performance bounds for least squares regression with L1 regularization from parameter estimation accuracy and feature selection quality perspectives. The main result proved for L1 regularization extends a similar result in [Ann. Statist. 35 (2007) 2313-2351] for the Dantzig selector. It gives an affirmative answer to an open question in [Ann. Statist. 35 (2007) 2358-2364]. Moreover, the result leads to an extended view of feature selection that allows less restrictive conditions than some recent…
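For reference, the estimator these bounds concern is least squares with an ℓ1 (Lasso) penalty; a schematic form, noting that the paper's exact normalization of the loss and penalty may differ:

$$
\hat{\beta} \;=\; \arg\min_{\beta \in \mathbb{R}^{p}} \; \frac{1}{2n}\,\lVert y - X\beta \rVert_2^{2} \;+\; \lambda\,\lVert \beta \rVert_{1}.
$$

The bounds then control how close $\hat{\beta}$ is to the true sparse coefficient vector (parameter estimation accuracy) and how well its support matches the set of relevant features (feature selection quality).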

Cited by 185 publications (198 citation statements) · References 19 publications

Citation statements (ordered by relevance):
“…A variety of sparsity-inducing penalty functions have been proposed to approximate the ℓ0 term: the exponential concave function [4], the ℓp-norm with 0 < p < 1 [15] and p < 0 [71], the Smoothly Clipped Absolute Deviation (SCAD) [13], the logarithmic function [82], and the capped-ℓ1 function [26] (see (21), (22) and Table 1 in Section 3 for the definitions of these functions). Using these approximations, several algorithms have been developed for the resulting optimization problems, most of them in the context of feature selection in classification, sparse regression, or, more specifically, sparse signal recovery: the Successive Linear Approximation (SLA) algorithm [4], DCA (Difference of Convex functions Algorithm) based algorithms [11,12,16,21,28,42,43,51,54,63,65], Local Linear Approximation (LLA) [87], two-stage ℓ1 [83], Adaptive Lasso [86], reweighted-ℓ1 algorithms [8], reweighted-ℓ2 algorithms such as the Focal Underdetermined System Solver (FOCUSS) [18,71,72], Iteratively Reweighted Least Squares (IRLS), and the Local Quadratic Approximation (LQA) algorithm [13,87].…”
Section: Portfolio Selection Problem With Cardinality Constraint (mentioning)
confidence: 99%
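As an illustration of the reweighted-ℓ1 family mentioned above, here is a minimal Python sketch, not the method of any particular cited paper: it assumes the common 1/(|bᵢ| + ε) weight update and uses a plain ISTA (proximal gradient) inner solver for the weighted Lasso subproblem.

import numpy as np

def weighted_lasso_ista(X, y, lam, w, n_iter=500):
    """Solve min_b 0.5*||y - Xb||_2^2 + lam * sum_i w_i |b_i| by ISTA."""
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the least-squares gradient
    b = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y)           # gradient of the smooth part
        z = b - grad / L                   # gradient step
        thr = lam * w / L                  # per-coordinate soft-threshold levels
        b = np.sign(z) * np.maximum(np.abs(z) - thr, 0.0)
    return b

def reweighted_l1(X, y, lam, eps=1e-3, n_rounds=5):
    """Reweighted-l1: repeatedly re-solve a weighted Lasso with updated weights."""
    w = np.ones(X.shape[1])
    b = np.zeros(X.shape[1])
    for _ in range(n_rounds):
        b = weighted_lasso_ista(X, y, lam, w)
        w = 1.0 / (np.abs(b) + eps)        # small |b_i| -> heavier penalty next round
    return b

# tiny synthetic check: 5 true nonzeros out of 50 coefficients
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 50))
beta_true = np.zeros(50)
beta_true[:5] = 3.0
y = X @ beta_true + 0.1 * rng.standard_normal(100)
print(np.flatnonzero(np.abs(reweighted_l1(X, y, lam=1.0)) > 1e-6))

Each round penalizes coordinates that were small in the previous solution more heavily, pushing the weighted ℓ1 penalty toward ℓ0-like behaviour, which is the motivation shared by the approximations listed in the excerpt.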
“…The theoretical properties of the estimator with the Lasso penalty have been extensively studied in the literature. For instance, sparsity oracle inequalities and statistical rates of convergence for the Lasso estimator are established by Meinshausen and Yu (2009); Bickel et al. (2009); Bunea et al. (2007); van de Geer (2008); Zhang (2009); Negahban et al. (2012), and variable selection consistency is studied by Meinshausen and Bühlmann (2006); Zhao and Yu (2006); Wainwright (2009). The class of nonconvex penalties includes the MCP (Zhang, 2010a), SCAD (Fan and Li, 2001), and the capped-ℓ1 penalty (Zhang, 2010b), among others.…”
Section: Introduction (mentioning)
confidence: 99%
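A representative form of the rates this passage refers to (constants and the exact regularity condition vary across the cited papers): under a restricted-eigenvalue-type condition with constant $\kappa$, an $s$-sparse target $\beta^{*}$, and noise level $\sigma$, the Lasso with a suitably chosen penalty level satisfies

$$
\lVert \hat{\beta} - \beta^{*} \rVert_{2} \;\le\; \frac{C\,\sqrt{s}\,\lambda}{\kappa},
\qquad
\lambda \asymp \sigma \sqrt{\frac{\log p}{n}},
$$

so the squared estimation error scales like $s \log p / n$ up to constants.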
“…We present in Section 7.3 some computation times for the more involved case with a non-convex objective function, using a block coordinate descent generalized EM algorithm. Regarding statistical theory, almost all results for high-dimensional Lasso-type problems have been developed for convex loss functions, e.g., the squared error in Gaussian regression (Greenshtein and Ritov, 2004; Meinshausen and Bühlmann, 2006; Zhao and Yu, 2006; Bunea et al., 2007; Zhang and Huang, 2008; Meinshausen and Yu, 2009; Wainwright, 2009; Bickel et al., 2009; Cai et al., 2009b; Candès and Plan, 2009; Zhang, 2009) or the negative log-likelihood in a generalized linear model. We present a non-trivial modification of the mathematical analysis of ℓ1-penalized estimation with convex loss to non-convex but smooth likelihood problems.…”
Section: Introduction (mentioning)
confidence: 99%
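Schematically, the criterion in question replaces the squared error by a (possibly non-convex) negative log-likelihood; in generic notation, not the cited paper's exact formulation:

$$
\hat{\theta} \;=\; \arg\min_{\theta}\;\Big\{ -\frac{1}{n} \sum_{i=1}^{n} \log p_{\theta}\!\left(y_{i} \mid x_{i}\right) \;+\; \lambda\, \lVert \theta \rVert_{1} \Big\}.
$$

For a Gaussian linear model this reduces to the ℓ1-penalized least squares problem above; the excerpt's point is that the convex-loss analysis must be non-trivially modified when the likelihood is smooth but non-convex.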