2021
DOI: 10.48550/arxiv.2110.10832
Preprint

Ensemble of Averages: Improving Model Selection and Boosting Performance in Domain Generalization

Cited by 6 publications (8 citation statements) · References 25 publications
“…the in-domain strategy by Gulrajani & Lopez-Paz (2020) to select the best hyper-parameters, and report the average performance and standard deviation across 5 random seeds. Baselines: We compare our method against standard ERM training, which has proven to be a frustratingly difficult baseline to beat (Gulrajani & Lopez-Paz, 2020), and also against several state-of-the-art methods on this benchmark: SWAD (Cha et al., 2021), MIRO (Cha et al., 2022), and SMA (Arpit et al., 2021). Finally, we show that our approach can be effectively integrated with stochastic weight averaging to obtain further gains.…”
Section: OOD Generalization in a Real-World Setting
confidence: 98%
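
As a concrete illustration of the weight-averaging baselines this excerpt refers to (SMA, stochastic weight averaging), the following is a minimal PyTorch-style sketch of keeping a simple moving average of model weights during training. It is not the exact implementation of any cited method; `model`, `loader`, `optimizer`, `loss_fn`, `num_steps`, and `avg_start` are illustrative assumptions.

```python
# Minimal sketch of simple moving weight averaging during training, in the
# spirit of SMA / stochastic weight averaging; not the exact implementation
# of any cited method. All argument names are illustrative assumptions.
import copy
import torch

def train_with_weight_averaging(model, loader, optimizer, loss_fn,
                                num_steps, avg_start=100):
    avg_model = copy.deepcopy(model)  # holds the running average of the weights
    n_averaged, step = 0, 0
    while step < num_steps:
        for x, y in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
            step += 1
            # After a burn-in period, fold the current weights into the
            # running average: avg <- (avg * n + w) / (n + 1).
            if step >= avg_start:
                with torch.no_grad():
                    for p_avg, p in zip(avg_model.parameters(),
                                        model.parameters()):
                        p_avg.mul_(n_averaged).add_(p).div_(n_averaged + 1)
                n_averaged += 1
            if step >= num_steps:
                break
    # BatchNorm running statistics are not averaged here; in practice they are
    # usually recomputed for avg_model with one pass over the training data.
    return avg_model
```

The averaged weights, rather than the final iterate, are then evaluated on the held-out domains, which is the general idea behind SWAD and SMA.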
“…Tackling the OOD robustness problem, Thomas et al. (2021) and Matsuura & Harada (2020) first cluster training examples into "pseudo-domains", after which standard domain generalization techniques are used. Another recent line of work proposes using model averaging (Cha et al., 2021; Li et al., 2022) and/or ensembling (Arpit et al., 2021) for better OOD generalization. These techniques are complementary to our contribution, and we demonstrate how they can benefit each other in our empirical evaluation.…”
Section: Domain Generalization and OOD Robustness
confidence: 99%
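
The combination of model averaging and ensembling referenced here can be sketched as averaging the predictions of several independently trained, weight-averaged models at test time. The snippet below illustrates only that test-time step; `averaged_models` and the helper names are assumptions, not the cited papers' APIs.

```python
# Minimal sketch (assumed names, not the cited papers' APIs): test-time
# ensembling of several independently trained, weight-averaged models.
import torch

@torch.no_grad()
def ensemble_predict(averaged_models, x):
    """Average the softmax outputs of the individual (weight-averaged) models."""
    probs = [torch.softmax(m(x), dim=-1) for m in averaged_models]
    return torch.stack(probs, dim=0).mean(dim=0)

@torch.no_grad()
def ensemble_accuracy(averaged_models, loader):
    """Evaluate the prediction-averaged ensemble on a held-out (OOD) loader."""
    correct, total = 0, 0
    for x, y in loader:
        pred = ensemble_predict(averaged_models, x).argmax(dim=-1)
        correct += (pred == y).sum().item()
        total += y.numel()
    return correct / total
```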
“…PTMs for domain generalization. Methods leveraging pre-trained models (PTMs) have shown promising improvements in domain generalization performance (Wiles et al., 2022; Arpit et al., 2021; Dong et al., 2022; Wortsman et al., 2022; Rame et al., 2022; Ramé et al., 2022). Among them, ensemble methods combined with PTMs show further advantages.…”
Section: Related Work
confidence: 99%
“…However, compared to the RHS of Equation 15, our proposed metrics offer two advantages. First, our metrics are easier to approximate with finite samples in practice (as shown in Section 4.3 of the main paper and Sections A.1 and A.2 of the Appendix), whereas estimating the KL divergence is challenging [Wang et al., 2021; Zhao et al., 2020]. Second, our metrics are closely connected to model error (as shown in Theorems 4.2 and 4.3), so they are better suited to evaluating DG datasets for benchmarking DG algorithms.…”
Section: A.3 Comparison Between the Proposed Metrics and Kullback-Leib...
confidence: 99%
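
To make the finite-sample difficulty mentioned in this excerpt concrete, the sketch below implements a naive histogram plug-in estimator of KL(P || Q) for one-dimensional samples; even when P and Q are identical, small-sample estimates are noticeably biased above the true value of zero. This is an illustrative example only, not taken from the cited works.

```python
# Illustrative only (assumed example, not from the cited works): a naive
# plug-in estimate of KL(P || Q) from finite 1-D samples via shared histograms.
import numpy as np

def kl_histogram_estimate(samples_p, samples_q, bins=20, eps=1e-12):
    """Plug-in KL(P || Q) estimate from 1-D samples using a shared binning."""
    lo = min(samples_p.min(), samples_q.min())
    hi = max(samples_p.max(), samples_q.max())
    p_counts, _ = np.histogram(samples_p, bins=bins, range=(lo, hi))
    q_counts, _ = np.histogram(samples_q, bins=bins, range=(lo, hi))
    p = p_counts / p_counts.sum() + eps   # eps avoids log(0) and division by zero
    q = q_counts / q_counts.sum() + eps
    return float(np.sum(p * np.log(p / q)))

# Even for two identical standard Gaussians the small-sample estimate sits
# noticeably above the true value of 0; more samples shrink the bias.
rng = np.random.default_rng(0)
print(kl_histogram_estimate(rng.normal(size=100), rng.normal(size=100)))
print(kl_histogram_estimate(rng.normal(size=100_000), rng.normal(size=100_000)))
```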
“…The distribution shift between training and test data may make most current approaches unreliable in practice. Hence, rather than generalization within the training distribution, the ability to generalize under distribution shift, namely domain generalization (DG) [Wang et al., 2021], is of greater significance in realistic scenarios.…”
Section: Introduction
confidence: 99%