2022
DOI: 10.48550/arxiv.2203.10789
Preprint

Domain Generalization by Mutual-Information Regularization with Pre-trained Models

Abstract: Domain generalization (DG) aims to learn a model that generalizes to an unseen target domain using only limited source domains. Previous attempts at DG fail to learn domain-invariant representations from the source domains alone because of the significant domain shifts between training and test domains. Instead, we reformulate the DG objective using mutual information with the oracle model, a model generalized to any possible domain. We derive a tractable variational lower bound by approximating the oracle model with a pre-trained model…
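Read literally, the regularization described in the abstract pulls the learned representation toward that of a frozen pre-trained network through a variational bound on mutual information. Below is a minimal sketch of that idea, assuming a Gaussian variational distribution with a learnable diagonal variance; the class and variable names (MIRegularizedModel, lambda_reg, etc.) are illustrative assumptions, not the authors' implementation.

```python
# Hedged sketch of mutual-information regularization toward a frozen
# pre-trained "oracle" encoder, in the spirit of the abstract above.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class MIRegularizedModel(nn.Module):
    def __init__(self, encoder: nn.Module, feat_dim: int, num_classes: int,
                 lambda_reg: float = 0.1):
        super().__init__()
        self.encoder = encoder                   # trainable encoder, initialized from pre-training
        self.oracle = copy.deepcopy(encoder)     # frozen copy standing in for the oracle model
        for p in self.oracle.parameters():
            p.requires_grad_(False)
        self.classifier = nn.Linear(feat_dim, num_classes)
        # learnable diagonal variance of the variational distribution
        self.log_var = nn.Parameter(torch.zeros(feat_dim))
        self.lambda_reg = lambda_reg

    def forward(self, x, y):
        z = self.encoder(x)                      # current features
        with torch.no_grad():
            z0 = self.oracle(x)                  # pre-trained ("oracle") features
        task_loss = F.cross_entropy(self.classifier(z), y)
        # Gaussian negative log-likelihood of the oracle features under the
        # variational distribution; minimizing it tightens the MI lower bound.
        var = self.log_var.exp()
        reg = 0.5 * (self.log_var + (z0 - z).pow(2) / var).mean()
        return task_loss + self.lambda_reg * reg
```

Setting lambda_reg to zero recovers plain ERM fine-tuning of the pre-trained encoder, which is why the regularizer can be read as a drop-in addition to standard training.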

Cited by 4 publications (10 citation statements) · References 25 publications
“…the in-domain strategy by Gulrajani & Lopez-Paz (2020)) to select the best hyper-parameters, and report the average performance and standard deviation across 5 random seeds. Baselines: We compare our method against standard ERM training, which has proven to be a frustratingly difficult baseline (Gulrajani & Lopez-Paz, 2020), and also against several state-of-the-art methods on this benchmark: SWAD (Cha et al., 2021), MIRO (Cha et al., 2022), and SMA (Arpit et al., 2021). Finally, we show that our approach can be effectively integrated with stochastic weight averaging to obtain further gains.…”
Section: OOD Generalization in a Real-World Setting
confidence: 98%
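SWAD and SMA, referenced in this citation statement, both rely on averaging model weights along the training trajectory. A minimal sketch of that kind of stochastic weight averaging follows, assuming a simple running mean over all iterations; it is an illustration of the general technique, not the cited implementations.

```python
# Hedged sketch of weight averaging along the training trajectory.
import copy
import torch

class WeightAverager:
    """Keeps a running average of model parameters across training iterations."""
    def __init__(self, model: torch.nn.Module):
        self.avg_model = copy.deepcopy(model)
        self.n_updates = 0

    @torch.no_grad()
    def update(self, model: torch.nn.Module):
        self.n_updates += 1
        for p_avg, p in zip(self.avg_model.parameters(), model.parameters()):
            # incremental mean: avg += (p - avg) / n
            p_avg += (p - p_avg) / self.n_updates
```

In use, `averager.update(model)` would be called after each optimizer step and `averager.avg_model` evaluated at the end of training.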
“…We hypothesize that the gradient bias mentioned above could be relieved if the unobservable gradient g_u minimizing risks in the unseen domains were computable. To achieve this, we borrow the assumption of Cha et al. (2022) that large-scale pre-trained models approximate the oracle model θ*, which is optimally generalized for any domain D. Since the unobservable gradient g_u cannot be computed from the unseen domains D_u directly, we consider the direction from the current model θ to the oracle model θ* as the unobservable gradient g_u. However, the oracle model is inaccessible in practice.…”
Section: GESTUR: Gradient Estimation for Unseen Domain Risk Minimization
confidence: 99%
“…Implementation details. Our implementation is built on the codebase of Cha et al. (2022). We use the Adam optimizer (Kingma & Ba, 2015) for parameter optimization.…”
Section: GESTUR: Gradient Estimation for Unseen Domain Risk Minimization
confidence: 99%
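For reference, a minimal optimizer setup consistent with this implementation detail; the learning rate and weight decay values are illustrative placeholders, not the reported hyper-parameters.

```python
import torch

# `model` is assumed to be the network being fine-tuned; hyper-parameter
# values below are placeholders, not the values used in the cited work.
optimizer = torch.optim.Adam(model.parameters(), lr=5e-5, weight_decay=0.0)
```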