2021
DOI: 10.48550/arxiv.2109.02355
Preprint

A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning

Abstract: The rapid recent progress in machine learning (ML) has raised a number of scientific questions that challenge the longstanding dogma of the field. One of the most important riddles is the good empirical generalization of overparameterized models. Overparameterized models are excessively complex with respect to the size of the training dataset, which results in them perfectly fitting (i.e., interpolating) the training data, which is usually noisy. Such interpolation of noisy data is traditionally associated with…
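To make the interpolation phenomenon in the abstract concrete, here is a minimal sketch (not from the paper; the dimensions and noise level are arbitrary assumptions) of a minimum-norm least-squares fit with more parameters than training samples: it attains essentially zero training error on noisy labels while still being evaluated on held-out data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (arbitrary assumptions): p features, only n < p noisy training samples.
n, p = 20, 200
X_train = rng.standard_normal((n, p))
w_true = rng.standard_normal(p) / np.sqrt(p)
y_train = X_train @ w_true + 0.5 * rng.standard_normal(n)   # noisy labels

# Minimum-norm interpolating solution via the pseudoinverse: fits the noise exactly.
w_hat = np.linalg.pinv(X_train) @ y_train
train_mse = np.mean((X_train @ w_hat - y_train) ** 2)

# Evaluate on fresh data drawn from the same model.
X_test = rng.standard_normal((1000, p))
y_test = X_test @ w_true + 0.5 * rng.standard_normal(1000)
test_mse = np.mean((X_test @ w_hat - y_test) ** 2)

print(f"train MSE ~ {train_mse:.1e} (interpolation), test MSE = {test_mse:.3f}")
```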

Cited by 22 publications (32 citation statements)
References 68 publications (154 reference statements)

“…Section 3 showed how reducing the number of parameters (to a point) lowers the achievable membership advantage. However, as has been shown in a variety of works (e.g., [9,23,10]) increasing the number of parameters leads to a "double descent" behavior where generalization performance improves, as captured in the following proposition similar to Thm. 1 of [23] but specialized to the asymptotic case.…”
Section: Mitigating Membership Inference Attacks (mentioning)
confidence: 85%
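The double-descent behavior referenced in this citation statement can be illustrated with a random-features least-squares model. The sketch below is illustrative only (the sample size, noise level, and feature counts are arbitrary assumptions, and it is not the asymptotic proposition from [23]): it sweeps the parameter count past the interpolation threshold and records the test error of the minimum-norm fit.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (arbitrary assumptions): n samples in d input dimensions, noisy linear labels.
n, d = 50, 30
X = rng.standard_normal((n, d))
beta = rng.standard_normal(d) / np.sqrt(d)
y = X @ beta + 0.3 * rng.standard_normal(n)
X_test = rng.standard_normal((2000, d))
y_test = X_test @ beta

def test_error(p):
    """Test MSE of the min-norm least-squares fit on p random ReLU features."""
    W = rng.standard_normal((d, p)) / np.sqrt(d)
    phi, phi_test = np.maximum(X @ W, 0.0), np.maximum(X_test @ W, 0.0)
    w_hat = np.linalg.pinv(phi) @ y   # interpolates the training data once p >= n
    return np.mean((phi_test @ w_hat - y_test) ** 2)

# Test error typically peaks near the interpolation threshold p = n, then descends again.
for p in [5, 20, 45, 50, 55, 100, 400, 2000]:
    print(f"p = {p:5d}   test MSE = {test_error(p):.3f}")
```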
“…The general motivation for this paper is the recent realization that privacy issues around machine learning might be exacerbated by today's trend towards increasingly overparameterized models that have more parameters than training data points and so can be trained to memorize (attain zero error on) the training data. Surprisingly, some overparameterized models (e.g., massive deep networks [7,8]) generalize extremely well [9,10]. Limited empirical evidence suggests that overparameterization and memorization may lead to greater privacy vulnerabilities [11,12,13].…”
Section: Introduction (mentioning)
confidence: 99%
“…Therefore, various methods were implemented, such as dropout and label smoothing, but they produced no obvious improvement or even worsened the results.
- Overparameterization of neural networks (Dar et al., 2021; Power et al., 2021), especially with a large number of epochs and a heavy dropout rate, might be beneficial for long-term fitting;
- Multitask LSTM: adjust the model to predict secondary targets, thus adding additional targets to the original one to reduce overfitting and improve generalization. New targets such as pressure lags, cumulative pressure, and pressure variance at each timestep.…”
Section: Quantitative Results (mentioning)
confidence: 99%
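The multitask-LSTM idea quoted above can be sketched as an LSTM backbone with one head per target. The PyTorch snippet below is a hypothetical illustration: the layer sizes, auxiliary-target names, and loss weights are assumptions and are not taken from the cited work.

```python
import torch
import torch.nn as nn

class MultiTaskLSTM(nn.Module):
    """LSTM backbone with a primary head (pressure) and auxiliary heads
    (e.g., pressure lag, cumulative pressure) used to regularize training."""

    def __init__(self, in_dim: int, hidden: int = 128, n_aux: int = 2):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden, num_layers=2,
                            batch_first=True, bidirectional=True)
        self.primary_head = nn.Linear(2 * hidden, 1)
        self.aux_heads = nn.ModuleList(
            [nn.Linear(2 * hidden, 1) for _ in range(n_aux)])

    def forward(self, x):
        h, _ = self.lstm(x)                       # (batch, seq, 2 * hidden)
        primary = self.primary_head(h).squeeze(-1)
        aux = [head(h).squeeze(-1) for head in self.aux_heads]
        return primary, aux

# Hypothetical usage: weight the auxiliary losses lower than the primary loss.
model = MultiTaskLSTM(in_dim=5)
x = torch.randn(8, 80, 5)                          # (batch, timesteps, features)
y_pressure = torch.randn(8, 80)
y_aux = [torch.randn(8, 80) for _ in range(2)]     # e.g., pressure lag, cumulative pressure

pred, aux_preds = model(x)
loss = nn.functional.l1_loss(pred, y_pressure)
loss = loss + 0.2 * sum(nn.functional.l1_loss(p, t) for p, t in zip(aux_preds, y_aux))
loss.backward()
```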
“…The method is feasible since there are only 950 possible output values in the training set;
- Perform data augmentation, e.g., masking augmentation applied randomly to R and C only, random shuffling of units within a specific short window of a sequence, and mixup augmentation in which two sequences are randomly selected and mixed. Other augmentations such as Cutout and CutMix can also be implemented;
- Perform Fourier or wavelet transforms to eliminate noise, create approximations of different value movements, and generalize short- and long-term trends;
- Overparameterization of neural networks (Dar et al., 2021; Power et al., 2021), especially with a large number of epochs and a heavy dropout rate, might be beneficial for long-term fitting;
- Multitask LSTM: adjust the model to predict secondary targets, thus adding additional targets to the original one to reduce overfitting and improve generalization. New targets such as pressure lags, cumulative pressure, and pressure variance at each timestep.…”
(mentioning)
confidence: 99%
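As one concrete reading of the mixup step described in the quote above, the sketch below mixes randomly paired sequences and their targets with a Beta-distributed coefficient. The alpha value, tensor shapes, and per-timestep regression targets are assumptions made for illustration, not details from the cited work.

```python
import torch

def mixup_sequences(x, y, alpha: float = 0.4):
    """Mix randomly paired sequences and targets: x' = lam * x + (1 - lam) * x_perm."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y + (1.0 - lam) * y[perm]
    return x_mixed, y_mixed

# Hypothetical usage on a batch of (batch, timesteps, features) sequences.
x = torch.randn(16, 80, 5)
y = torch.randn(16, 80)          # per-timestep regression targets
x_aug, y_aug = mixup_sequences(x, y)
```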