2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr52688.2022.01333

Can Neural Nets Learn the Same Model Twice? Investigating Reproducibility and Double Descent from the Decision Boundary Perspective

Cited by 34 publications (44 citation statements)
References 11 publications
“…Implications: This exciting finding supports our interpretation that faster interpolation, as promoted by overparameterization, results in model functions that are overall low-complexity, owing to minimal (but meaningful) deviation from initialization. Our proposed interpretation corroborates the empirical observations of Gamba et al. (2022a) and Somepalli et al. (2022), who respectively report that large models express low-curvature model functions in input space and consistent decision boundaries. Finally, our findings extend Neyshabur et al. (2018), who initially reported that distance from initialization decreases for overparameterized models.…”
Section: Overparameterization Constrains Complexity (supporting)
confidence: 91%
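The distance-from-initialization quantity discussed in this excerpt is straightforward to track empirically. Below is a minimal sketch (not taken from any of the cited papers; the model is a stand-in) that snapshots a PyTorch network's parameters at initialization and reports the L2 distance the parameters have travelled:

```python
import torch

def distance_from_init(model: torch.nn.Module, init_state: dict) -> float:
    """L2 distance between the model's current parameters and its init snapshot."""
    total = 0.0
    for name, p in model.named_parameters():
        total += (p.detach() - init_state[name]).pow(2).sum().item()
    return total ** 0.5

model = torch.nn.Linear(784, 10)  # stand-in for any architecture
init_state = {n: p.detach().clone() for n, p in model.named_parameters()}
# ... train the model here ...
print(distance_from_init(model, init_state))  # 0.0 until parameters are updated
```

Tracking this value across model widths is one way to reproduce the trend Neyshabur et al. (2018) report, namely that it shrinks as models grow more overparameterized.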
“…In order to investigate how sample corruption affects margin measurements, we train several networks of increasing capacity to the point of interpolation (close to zero train error) on the widely used classification datasets MNIST [24] and CIFAR10 [25]. 'Toy problems' such as these are used extensively to probe generalization [15,23,13]. We separately corrupt the training data of some models using two types of noise, defined in Section 4.1.…”
Section: Methods (mentioning)
confidence: 99%
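The two noise types used in this excerpt are defined in the cited paper's Section 4.1 and are not reproduced here; as a stand-in, the sketch below applies one common corruption, uniform label noise, which flips a chosen fraction of training labels to a different random class:

```python
import numpy as np

def corrupt_labels(labels: np.ndarray, frac: float, num_classes: int,
                   seed: int = 0) -> np.ndarray:
    """Flip a random fraction of labels to a different, uniformly chosen class."""
    rng = np.random.default_rng(seed)
    noisy = labels.copy()
    idx = rng.choice(len(labels), size=int(frac * len(labels)), replace=False)
    # An offset in [1, num_classes) taken modulo num_classes can never
    # reproduce the original label.
    noisy[idx] = (labels[idx] + rng.integers(1, num_classes, size=len(idx))) % num_classes
    return noisy

y = np.arange(10)
print(corrupt_labels(y, frac=0.3, num_classes=10))
```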
“…In order to find the nearest point on the decision boundary, we search over each class j ≠ i separately for each sample and choose the point with the smallest distance. As is convention for margin measurements [20,23], we use Euclidean distance* as the metric, meaning the margin is given by:

$$\operatorname{margin}(x) = \min_{j \neq i} \lVert x - \hat{x}_j \rVert_2,$$

where $\hat{x}_j$ denotes the nearest point to $x$ on the decision boundary between class $i$ and class $j$.…”
Section: Formulating the Classification Margin (mentioning)
confidence: 99%
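For a deep network the nearest boundary point must be found numerically, as the excerpt describes, but the min-over-classes definition is easy to make concrete in the linear case, where the distance to each pairwise decision boundary has a closed form. A minimal sketch (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def linear_margin(x: np.ndarray, y: int, W: np.ndarray, b: np.ndarray) -> float:
    """Euclidean distance from x to the nearest decision boundary of a linear
    classifier with scores W @ x + b; negative if x is misclassified."""
    dists = []
    for j in range(W.shape[0]):
        if j == y:
            continue
        w_diff = W[y] - W[j]
        # Signed distance to the hyperplane where class y and class j tie.
        dists.append((w_diff @ x + b[y] - b[j]) / np.linalg.norm(w_diff))
    return min(dists)

W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])  # 3 classes, 2-D inputs
b = np.zeros(3)
print(linear_margin(np.array([2.0, 0.5]), y=0, W=W, b=b))  # ≈ 1.06
```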
“…In principle, two sources could cause negative flips during a model upgrade: (a) the stochasticity of model training, including initialization, data-loading order, and the optimization process (Somepalli et al., 2022); and (b) the distinctions between the old and new model hypotheses, including architecture and pretraining data and procedure, which lead to different representation-space structures and prediction behaviors in terms of decision boundaries.…”
Section: Limitations of New Model Ensemble (mentioning)
confidence: 99%
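The negative-flip rate referenced in this excerpt has a simple operational definition: the fraction of samples the old model classifies correctly but the new model gets wrong. A minimal sketch (function name hypothetical):

```python
import numpy as np

def negative_flip_rate(y_true, old_pred, new_pred) -> float:
    """Fraction of samples correct under the old model but wrong under the new one."""
    y_true, old_pred, new_pred = map(np.asarray, (y_true, old_pred, new_pred))
    flips = (old_pred == y_true) & (new_pred != y_true)
    return float(flips.mean())

y_true   = np.array([0, 1, 2, 1, 0])
old_pred = np.array([0, 1, 2, 0, 0])
new_pred = np.array([0, 2, 2, 1, 1])
print(negative_flip_rate(y_true, old_pred, new_pred))  # 0.4: two negative flips
```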