2022
DOI: 10.48550/arxiv.2203.08124
Preprint

Can Neural Nets Learn the Same Model Twice? Investigating Reproducibility and Double Descent from the Decision Boundary Perspective

Abstract: We discuss methods for visualizing neural network decision boundaries and decision regions. We use these visualizations to investigate issues related to reproducibility and generalization in neural network training. We observe that changes in model architecture (and its associated inductive bias) cause visible changes in decision boundaries, while multiple runs with the same architecture yield results with strong similarities, especially in the case of wide architectures. We also use decision boundary methods…

Cited by 3 publications (5 citation statements), 2022–2023.
References 29 publications (42 reference statements).
“…Transfer learning priors range from the high-entropy distribution provided by training from scratch to those pretrained models which seem to consistently select a single basin [35]. Possible influences on basin selection, and therefore on generalization strategies, may include length of pretraining [49], data scheduling, and architecture selection [43]. The strength of a prior towards particular basins may not only be linked to the training procedure, but also strongly related to the availability of features in the pretrained representations [29,16,42].…”
Section: Discussion and Future Work
confidence: 99%
“…However, variation in performance on diagnostic sets is even more substantial, from social biases [40] to unusual paraphrases [31,56]. Benton et al. [1] found that a wide variety of decision boundaries were expressed within a low-loss volume, and Somepalli et al. [43] further found that there is diversity in boundaries during OOD generalization, far off the data manifold. Our work contributes to diversity in generalization by linking sets of models that share low-dimensional subspaces to particular OOD generalization behavior.…”
Section: Related Work
confidence: 99%
“…Using three sample images as a base “triplet”, the region of the decision space that lies between the triplet samples can be visualized. Vectors representing the positions of two of the triplet samples relative to the third are used to construct a vector space, which facilitates interpolation among the triplet samples, creating a vicinal distribution [8]. A “virtual” data set composed of images is sampled from the vicinal distribution.…”
Section: Decision Regions and Boundaries
confidence: 99%
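
To make that construction concrete, here is a minimal sketch in PyTorch of the triplet-plane visualization the statement describes. It assumes a trained classifier model and three image tensors x0, x1, x2 of identical shape; all names and the sampling range are illustrative assumptions, not the cited papers' exact implementation.

import torch

def decision_region_plane(model, x0, x1, x2, resolution=50):
    # Two basis vectors spanning the plane through the triplet (x0, x1, x2).
    v1 = (x1 - x0).flatten()
    v2 = (x2 - x0).flatten()
    # Sample slightly beyond the triangle so the regions around it are visible
    # (the [-0.5, 1.5] range is an assumed choice, not taken from the paper).
    coords = torch.linspace(-0.5, 1.5, resolution)
    points = [x0.flatten() + a * v1 + b * v2 for a in coords for b in coords]
    batch = torch.stack(points).reshape(-1, *x0.shape)  # restore image shape
    model.eval()                                        # inference mode
    with torch.no_grad():
        preds = model(batch).argmax(dim=1)              # predicted class per grid point
    return preds.reshape(resolution, resolution)        # 2-D map of decision regions

The returned grid plays the role of the labels of the “virtual” data set: rendering it (for example with matplotlib's imshow) colors the plane by predicted class and exposes the decision regions lying between the three samples.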
“…The width of the T can be interpreted as the inductive bias of the learning algorithm. The decision boundaries of neural networks usually lie on the data manifold [25], and the network behaves more smoothly off the data manifold. A natural consequence of this is that the heads of the Ts will be large.…”
Section: Inductive Bias Can Hurt Robustness Even Further
confidence: 99%