2020
DOI: 10.1109/jsait.2020.2984716

Harmless Interpolation of Noisy Data in Regression

Abstract: A continuing mystery in understanding the empirical success of deep neural networks is their ability to achieve zero training error and generalize well, even when the training data is noisy and there are more parameters than data points. We investigate this overparameterized regime in linear regression, where all solutions that minimize training error interpolate the data, including noise. We characterize the fundamental generalization (mean-squared) error of any interpolating solution in the presence of noise…
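As a minimal illustration of the setting the abstract describes (not code from the paper), the sketch below fits the minimum-l2-norm interpolator in an overparameterized linear regression with noisy labels and reports its training and test mean-squared error. The dimensions, noise level, and sparse ground-truth signal are assumptions chosen only for the demo.

```python
# Minimal sketch (assumed setup, not the paper's experiments): with more
# parameters d than samples n, the minimum-l2-norm least-squares solution
# interpolates the noisy training data exactly, and its generalization
# (mean-squared) error is estimated on fresh test data.
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 500                       # n samples, d >> n parameters (overparameterized)
beta_true = np.zeros(d)
beta_true[:5] = 1.0                  # sparse ground-truth signal (assumption for the demo)

X = rng.normal(size=(n, d))
y = X @ beta_true + 0.5 * rng.normal(size=n)           # noisy labels

# Minimum-norm interpolator: beta_hat = X^T (X X^T)^{-1} y, via the pseudoinverse.
beta_hat = np.linalg.pinv(X) @ y

train_mse = np.mean((X @ beta_hat - y) ** 2)           # ~0: the noise is interpolated
X_test = rng.normal(size=(1000, d))
y_test = X_test @ beta_true + 0.5 * rng.normal(size=1000)
test_mse = np.mean((X_test @ beta_hat - y_test) ** 2)  # test (generalization) error

print(f"train MSE: {train_mse:.2e}, test MSE: {test_mse:.3f}")
```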

Cited by 55 publications (18 citation statements)
References 57 publications
“…Deng et al. [24] characterize logistic regression test error using the Gaussian min-max theorem. Muthukumar et al. [25] provide bounds on the risk.…”
Section: Related Work (mentioning)
confidence: 99%
“…The reluctance to overfit inhibited exploration of a range of settings where y(x) = ⟨β_int, x⟩ provided optimal or near-optimal predictions. Very recently, these 'harmless interpolation' (Muthukumar, Vodrahalli, Subramanian and Sahai 2020b) or 'benign over-fitting' (Bartlett, Long, Lugosi and Tsigler 2020) regimes have become a very active direction of research, a development inspired by efforts to understand deep learning. In particular, provided a spectral characterization of models exhibiting this behaviour.…”
Section: When Do Minimum Norm Predictors Generalize? (mentioning)
confidence: 99%
“…Recent empirical evidence has shown that certain algorithms, contrary to classical learning theory, can interpolate noisy data (achieve zero training error) while also generalizing well out of sample (low test error) [2,10,14]. We have also seen this phenomenon rigorously analyzed in theory for parametric methods such as linear regression and random feature regression [1,3,7,9], as well as non-parametric methods such as kernel regression with singular kernels [4][5][6].…”
Section: Introduction (mentioning)
confidence: 99%