2021
DOI: 10.48550/arxiv.2106.02266
Preprint

SAND-mask: An Enhanced Gradient Masking Strategy for the Discovery of Invariances in Domain Generalization

Soroosh Shahtalebi,
Jean-Christophe Gagnon-Audet,
Touraj Laleh
et al.

Abstract: A major bottleneck in the real-world application of machine learning models is their failure to generalize to unseen domains whose data distribution is not i.i.d. with respect to the training domains. This failure often stems from learning non-generalizable features in the training domains that are spuriously correlated with the label of the data. To address this shortcoming, there has been a growing surge of interest in learning good explanations that are hard to vary, which is studied under the notion of Out-of-Distribution (OOD) generalization. […]

Cited by 8 publications (13 citation statements). References 24 publications (55 reference statements).

“…The reduction of the representation distribution mismatch across source domains can be achieved by minimizing the maximum mean discrepancy criterion (Gretton et al. 2012) combined with an adversarial autoencoder (MMD) (Li et al. 2018b), minimizing the difference between the means (Tzeng et al. 2014) or covariance matrices (CORAL) (Sun and Saenko 2016) in the embedding space across different domains, or minimizing a contrastive loss as a regularization (Motiian et al. 2017; Yoon, Hamarneh, and Garbi 2019; Mahajan, Tople, and Sharma 2020), e.g., SelfReg (Kim et al. 2021). Domain alignment is also addressed by promoting loss gradient alignment across different domains via inner product maximization (Fish) (Shi et al. 2021), binary gradient masking (AND-mask) (Parascandolo et al. 2020; Shahtalebi et al. 2021), or continuous gradient masking (SAND-mask) (Shahtalebi et al. 2021).…”
Section: Domain Generalization
Mentioning confidence: 99%
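The second-order alignment idea named in this statement can be made concrete with a short sketch. This is a minimal numpy illustration, not the cited authors' implementation: it combines the mean-difference penalty of Tzeng et al. and the covariance-difference penalty of CORAL in a single function, and it omits the normalization constant used in the original CORAL paper.

```python
import numpy as np

def alignment_penalty(feats_a: np.ndarray, feats_b: np.ndarray) -> float:
    """Penalize mismatch in first- and second-order feature statistics
    between two domains (mean difference + covariance difference)."""
    mean_gap = np.mean(feats_a, axis=0) - np.mean(feats_b, axis=0)
    cov_gap = np.cov(feats_a, rowvar=False) - np.cov(feats_b, rowvar=False)
    # Squared L2 distance of the means plus squared Frobenius distance
    # of the covariance matrices.
    return float(np.sum(mean_gap ** 2) + np.sum(cov_gap ** 2))
```

Adding this penalty, scaled by a regularization weight, to the task loss pushes the embedding statistics of the source domains toward each other, which is the shared mechanism behind the alignment methods listed above.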
“…Domain alignment methods attempt to learn a domain-invariant representation of the data from the source domains by regularizing the learning objective. Variants of such a regularization include the minimization across the source domains of the maximum mean discrepancy criterion (MMD) [21,35], the minimization of a distance metric between the domain-specific means [71] or covariance matrices [69], the minimization of a contrastive loss [50,83,44,29], or the maximization of loss gradient alignment [65,63]. Other works use adversarial training with a domain discriminator model [18,37] for the same purpose.…”
Section: Domain Generalization
Mentioning confidence: 99%
“…Though proven advantageous in some curated test conditions, their proposed mechanism intrinsically relies on arithmetic averaging of Hessians. Furthermore, AND-masking on the average loss landscape might induce "dead zones", where the gradients from different environments are unable to update the parameters unless their directions strictly match with each other (19).…”
Section: Introduction
Mentioning confidence: 99%
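The "dead zone" behavior described in this statement can be illustrated with a small sketch of binary AND-masking. This is a minimal numpy illustration under assumed shapes (one flattened gradient vector per environment), not the reference implementation; with a full agreement threshold, a parameter receives no update unless every environment's gradient has the same sign.

```python
import numpy as np

def and_mask_update(env_grads: list, threshold: float = 1.0) -> np.ndarray:
    """Binary AND-mask: average the per-environment gradients, but zero out
    every component whose sign is not shared widely enough across environments."""
    grads = np.stack(env_grads)                               # (n_envs, n_params)
    sign_agreement = np.abs(np.mean(np.sign(grads), axis=0))  # 1.0 = unanimous
    mask = (sign_agreement >= threshold).astype(grads.dtype)
    return mask * grads.mean(axis=0)

# With threshold=1.0, a single disagreeing environment creates a "dead zone":
# the corresponding parameter gets a zero gradient and is never updated.
g = and_mask_update([np.array([0.5, -0.2]), np.array([0.3, 0.4])])
# -> component 0 passes (both gradients positive); component 1 is masked to 0.
```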
“…Nevertheless, the masking strategy requires that the directions of the gradients from different environments strictly match with each other in order for the gradients to update that particular parameter. To address this issue, Shahtalebi et al. proposed a smoothed AND-masking algorithm (SAND-mask), which not only validates the invariance in the direction of gradients but also promotes agreement among the gradient magnitudes (19). Yet, in their reported experiments, the proposed masking strategy barely matches the performance of other algorithms, including the AND-masking strategy.…”
Section: Introduction
Mentioning confidence: 99%
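The smoothed rule this statement describes can be sketched as follows. This is a hedged numpy approximation of SAND-mask's continuous masking, not the authors' exact code: a tanh-based mask grows with sign agreement across environments and is scaled by an inverse-variance term so that gradient magnitudes must also agree; the hyperparameters k and tau and the epsilon constant are illustrative assumptions.

```python
import numpy as np

def sand_mask_update(env_grads: list, k: float = 10.0,
                     tau: float = 0.5, eps: float = 1e-12) -> np.ndarray:
    """Continuous (smoothed) gradient mask: instead of a hard sign test,
    each component gets a soft weight in [0, 1] that rises with sign
    agreement and falls with cross-environment gradient variance."""
    grads = np.stack(env_grads)                               # (n_envs, n_params)
    sign_agreement = np.abs(np.mean(np.sign(grads), axis=0))  # direction agreement
    inv_var = 1.0 / (grads.var(axis=0) + eps)                 # magnitude agreement
    mask = np.tanh(k * inv_var * (sign_agreement - tau))
    mask = np.clip(mask, 0.0, 1.0)  # keep only components that promote agreement
    return mask * grads.mean(axis=0)
```

Unlike the binary mask sketched earlier, components with partial agreement still receive a damped update, which is how the smoothed variant avoids hard dead zones.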