2020
DOI: 10.1073/pnas.1907378117

Benign overfitting in linear regression

Abstract: The phenomenon of benign overfitting is one of the key mysteries uncovered by deep learning methodology: deep neural networks seem to predict well, even with a perfect fit to noisy training data. Motivated by this phenomenon, we consider when a perfect fit to training data in linear regression is compatible with accurate prediction. We give a characterization of Gaussian linear regression problems for which the minimum norm interpolating prediction rule has near-optimal prediction accuracy. The characterization…
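The minimum norm interpolating rule mentioned in the abstract is the pseudoinverse solution, theta_hat = X^T (X X^T)^{-1} y, in the overparameterized regime. A minimal sketch of it follows (Python/NumPy); the isotropic Gaussian data, dimensions, and noise level are illustrative assumptions, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 500                       # overparameterized: far more features than samples
theta_star = np.zeros(p)
theta_star[0] = 1.0                  # simple ground-truth signal (illustrative)

X = rng.standard_normal((n, p))      # isotropic Gaussian features (illustrative)
y = X @ theta_star + 0.5 * rng.standard_normal(n)   # noisy labels

# Minimum ell_2-norm interpolator: theta_hat = X^T (X X^T)^{-1} y
theta_hat = X.T @ np.linalg.solve(X @ X.T, y)

# Perfect fit to the noisy training data ...
print("max train residual:", np.max(np.abs(X @ theta_hat - y)))

# ... while the error on fresh data from the same model is what the paper studies
X_new = rng.standard_normal((10_000, p))
print("test MSE:", np.mean((X_new @ theta_hat - X_new @ theta_star) ** 2))
```

With this flat, isotropic spectrum the fit is generally not benign (much of the signal is lost and the test error stays near that of predicting zero); the paper's characterization singles out spectra whose behavior is more favorable.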

Citations: cited by 322 publications (433 citation statements)
References: 16 publications
“…Additionally, they study the misspecified model asymptotically and show that double descent can occur with sufficiently improved approximability, and replicate the phenomena in a specific non-linear model. [8] sharply upper and lower bound the (non-asymptotic) generalization error of the ℓ2-minimizing interpolator for Gaussian features (which are not whitened/independent in general). They characterize necessary and sufficient conditions for the ℓ2-minimizing interpolator to avoid what we call signal "bleed" and noise overfitting in terms of functionals of the spectrum of the Gaussian covariance matrix.…”
Section: Concurrent Work in High-Dimensional Linear Regression
confidence: 99%
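The "functionals of the spectrum" in the snippet above are, as best I recall the paper's definitions, the effective ranks r_k(Sigma) = (sum_{i>k} lambda_i) / lambda_{k+1} and R_k(Sigma) = (sum_{i>k} lambda_i)^2 / sum_{i>k} lambda_i^2. The sketch below just computes them for an illustrative decaying spectrum and should be read as an assumption-laden illustration, not the paper's own code:

```python
import numpy as np

def effective_ranks(eigs: np.ndarray, k: int):
    """Effective ranks r_k and R_k of a covariance spectrum.

    eigs must be sorted in decreasing order; the definitions are recalled
    from Bartlett et al. and should be checked against the paper.
    """
    tail = eigs[k:]                           # lambda_{k+1}, lambda_{k+2}, ...
    r_k = tail.sum() / eigs[k]                # (sum of the tail) / lambda_{k+1}
    R_k = tail.sum() ** 2 / (tail ** 2).sum()
    return r_k, R_k

# Illustrative spectrum: slow polynomial decay (not taken from the paper).
p = 2000
eigs = 1.0 / np.arange(1, p + 1) ** 1.01

for k in (0, 10, 100):
    r_k, R_k = effective_ranks(eigs, k)
    print(f"k = {k:4d}   r_k(Sigma) = {r_k:10.1f}   R_k(Sigma) = {R_k:10.1f}")
```

Roughly, the conditions ask the tail effective rank to be large relative to the sample size (so noise is spread over many weak directions) while the overall effective rank stays small relative to it (so the signal is not "bled" away); the exact statements should be taken from the paper, not from this sketch.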
“…An earlier edition of our work was presented at Information Theory and Applications, February 2019 and subsequently accepted to IEEE International Symposium on Information Theory, July 2019. Several elegant and interesting papers [6][7][8][9][10][46] have appeared around this time. All of these center around the analysis of the ℓ2-minimizing interpolator.…”
Section: Concurrent Work in High-Dimensional Linear Regression
confidence: 99%
“…Another reason why good solutions can be found so easily by stochastic gradient descent is that, unlike low-dimensional models where a unique solution is sought, different networks with good performance converge from random starting points in parameter space. Because of over-parameterization (13), the degeneracy of solutions changes the nature of the problem from finding a needle in a haystack to a haystack of needles.…”
Section: Origins Of Deep Learning. I Have Written a Book, The Deep Le…
confidence: 99%
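The "haystack of needles" picture in the snippet above can be reproduced even in an overparameterized linear model: gradient descent from different random starting points reaches essentially zero training error every time, yet the recovered parameter vectors differ. A minimal sketch, with all data, dimensions, and step sizes chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 20, 200                       # many more parameters than training points
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)           # arbitrary (even pure-noise) targets

def gd_fit(theta0: np.ndarray, lr: float = 1e-2, steps: int = 20_000) -> np.ndarray:
    """Plain gradient descent on mean squared error from a given initialization."""
    theta = theta0.copy()
    for _ in range(steps):
        grad = X.T @ (X @ theta - y) / n
        theta -= lr * grad
    return theta

solutions = [gd_fit(0.1 * rng.standard_normal(p)) for _ in range(3)]

for i, th in enumerate(solutions):
    print(f"run {i}: train MSE = {np.mean((X @ th - y) ** 2):.2e}")

# Every run interpolates, but the solutions are different points of an
# affine solution set of dimension p - n (here 180).
print("||theta_0 - theta_1|| =", np.linalg.norm(solutions[0] - solutions[1]))
```

Which needle gradient descent lands on is determined by the starting point: in this linear case, the component of the initialization orthogonal to the data's row space is never updated, so different random starts end at different interpolating solutions.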