2019
DOI: 10.48550/arxiv.1909.05122
Preprint

Implicit Regularization for Optimal Sparse Recovery

Abstract: We investigate implicit regularization schemes for gradient descent methods applied to unpenalized least squares regression to solve the problem of reconstructing a sparse signal from an underdetermined system of linear measurements under the restricted isometry assumption. For a given parametrization yielding a non-convex optimization problem, we show that prescribed choices of initialization, step size and stopping time yield a statistically and computationally optimal algorithm that achieves the minimax rat…
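The parametrization referred to in the abstract is the Hadamard-product (quadratic) reparametrization of the regression vector. Below is a minimal sketch of the scheme, assuming β = u⊙u − v⊙v, a Gaussian design, and plain gradient descent on the unpenalized least-squares loss; the initialization scale alpha, step size eta, and stopping time T are illustrative stand-ins, not the prescribed constants from the paper.

```python
import numpy as np

# Minimal sketch: sparse recovery via gradient descent on the unpenalized
# least-squares loss under the Hadamard-product parametrization
# beta = u * u - v * v. Constants (alpha, eta, T) are illustrative choices,
# not the prescribed values from the paper.

rng = np.random.default_rng(0)
n, d, k = 100, 500, 5                        # measurements, ambient dimension, sparsity
X = rng.standard_normal((n, d)) / np.sqrt(n)
beta_star = np.zeros(d)
beta_star[:k] = 1.0                          # k-sparse ground truth
y = X @ beta_star + 0.01 * rng.standard_normal(n)

alpha, eta, T = 1e-6, 0.1, 3000              # small init, step size, early-stopping time
u = alpha * np.ones(d)
v = alpha * np.ones(d)

for _ in range(T):
    beta = u * u - v * v
    r = X.T @ (X @ beta - y)                 # gradient of 0.5 * ||X beta - y||^2 w.r.t. beta
    u = u - eta * 2.0 * r * u                # chain rule: d beta_i / d u_i =  2 u_i
    v = v + eta * 2.0 * r * v                # chain rule: d beta_i / d v_i = -2 v_i

beta_hat = u * u - v * v
print("ell_2 recovery error:", np.linalg.norm(beta_hat - beta_star))
```

The small initialization keeps all coordinates near zero until the signal coordinates, whose gradients are large, grow multiplicatively; stopping before the noise-driven coordinates have time to grow is what plays the role of an explicit sparsity penalty.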

Cited by 2 publications (3 citation statements)
References 30 publications
“…The linear diagonal neural networks we consider have been studied in the case of gradient descent [33] and stochastic gradient descent with label noise [15]. In both cases the authors show that this model has the ability to implicitly bias the training procedure to help retrieve a sparse predictor.…”
Section: Related Work
“…For the sake of completeness, the study of diagonal linear networks of arbitrary depth p ≥ 3 is done in Appendix E.2. Also note that, in addition to being a toy neural model, it has received recent attention on its own for its practical ability to induce sparsity [33,34,15] or to solve phase retrieval problems [37].…”
Section: Notations
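The depth-p diagonal linear network mentioned in the excerpt above generalizes the quadratic parametrization of the cited paper. A sketch of one common symmetric form (the exact variant studied in the referenced Appendix E.2 may differ) is

f_{u,v}(x) = \langle u^{\odot p} - v^{\odot p},\; x \rangle, \qquad u^{\odot p} := (u_1^{p}, \dots, u_d^{p}),

so that p = 2 recovers the Hadamard parametrization above, and gradient descent on the least-squares loss over (u, v) again induces a sparsity-promoting implicit bias.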
“…The quadratic parametrisations which we consider have become popular lately (Vaškevičius et al., 2019) since, despite their simplicity, they already make it possible to grasp the complexity of more general networks. Indeed, they highlight important aspects of the theoretical concerns of modern machine learning: the neural tangent kernel regime, the roles of overparametrisation and of the initialisation (Woodworth et al., 2020).…”
Section: Introduction