“…A plethora of algorithms are proposed: learning invariant representation across domains [7,21,38,20], minimizing the weighted combination of risks from training domains [35], using different risk penalty terms to facilitate invariance prediction [1,17], causal inference approaches [31], and forcing the learned representation different from a set of pre-defined biased representations [2], mixup-based approaches [48,41,26], etc. A recent study [10] shows that no domain generalization methods achieve superior performance than ERM across a broad range of datasets.…”