2020
DOI: 10.48550/arxiv.2006.06049
Preprint

On Mixup Regularization

Abstract: Mixup is a data augmentation technique that creates new examples as convex combinations of training points and labels. This simple technique has been empirically shown to improve the accuracy of many state-of-the-art models in different settings and applications, but the reasons behind this empirical success remain poorly understood. In this paper we take a substantial step in explaining the theoretical foundations of Mixup, by clarifying its regularization effects. We show that Mixup can be interpreted as standard…
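
As a minimal, hedged sketch of the technique described in the abstract (standard Mixup with a single Beta(α, α)-distributed mixing weight per batch; the function name and NumPy-based setup are illustrative, not the paper's implementation):

```python
import numpy as np

def mixup_batch(x, y, alpha=0.2, rng=None):
    """Return Mixup examples: convex combinations of paired inputs and labels.

    x     : array of shape (batch, ...) of inputs
    y     : array of shape (batch, num_classes) of one-hot labels
    alpha : concentration of the Beta(alpha, alpha) mixing distribution
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)              # mixing weight lambda in [0, 1]
    perm = rng.permutation(len(x))            # random re-pairing of the batch
    x_mix = lam * x + (1.0 - lam) * x[perm]   # convex combination of inputs
    y_mix = lam * y + (1.0 - lam) * y[perm]   # same combination of labels
    return x_mix, y_mix
```

The mixed batch (x_mix, y_mix) is then used in place of the original batch for one training step; drawing a fresh lambda per batch (or per example) is a common design choice.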

Cited by 17 publications (25 citation statements)
References 15 publications
“…The positive effect of this linear behavior between samples has led several authors to seek theoretical and empirical explanations of Mixup. Carratino et al (2020) show that Mixup can be interpreted as the combination of a data transformation and a data perturbation. A first transformation shrinks both inputs and outputs towards their mean.…”
Section: Related Work (mentioning)
confidence: 99%
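
A hedged sketch of the shrinkage described in the statement above, in notation inferred from the citing text rather than taken verbatim from Carratino et al (2020): here x̄ and ȳ denote the empirical means of inputs and labels, and θ̄ ≤ 1 the average mixing weight; the accompanying zero-mean perturbation terms are omitted.

```latex
\tilde{x}_i = \bar{x} + \bar{\theta}\,\bigl(x_i - \bar{x}\bigr),
\qquad
\tilde{y}_i = \bar{y} + \bar{\theta}\,\bigl(y_i - \bar{y}\bigr)
```

Since θ̄ ≤ 1, the transformed points are pulled towards (x̄, ȳ); under this interpretation, Mixup training corresponds to fitting these shrunken points subject to additional random perturbations.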
“…We finally retrained this best hyperparameter setting on the combined train and validation sets. [Table: datasets and methods]
- CIFAR (Krizhevsky, 2009): BatchEnsemble (Wen et al, 2020), Hyper-BatchEnsemble (Wenzel et al, 2020), MIMO (Havasi et al, 2020), Rank-1 BNN (Gaussian) (Dusenberry et al, 2020a), Rank-1 BNN (Cauchy), SNGP, MC-Dropout (Gal and Ghahramani, 2016), Ensemble (Lakshminarayanan et al, 2016), Hyper-deep ensemble (Wenzel et al, 2020), Variational Inference (Blundell et al, 2015), Heteroscedastic (Collier et al, 2021)
- CLINC (Larson et al, 2019): SNGP, MC-Dropout, Ensemble
- Diabetic Retinopathy Detection (Filos et al, 2019): MC-Dropout, Ensemble, Radial Bayesian Neural Networks (Farquhar et al, 2020), Variational Inference
- ImageNet (Russakovsky et al, 2015): MixUp (Carratino et al, 2020…”
Section: Appendix A UML Diagram (mentioning)
confidence: 99%
“…An immediate question is, does the added correlation lead to more meaningful representations? It is claimed that the strength of MixUp lies in causing the model to behave linearly between two images [41] or in pushing the examples towards their mean [4]. Both of these claims rely on the combined images to be generated from the same distribution.…”
Section: Occlusion Measurement (mentioning)
confidence: 99%