2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr52688.2022.00705
Safe Self-Refinement for Transformer-based Domain Adaptation

Cited by 52 publications (22 citation statements)
References 15 publications
“…These methods can either explicitly align the feature distributions using a specific distance metric [34,36,53], or implicitly align the distributions using an adversarial loss [8,13,35] or GAN [21,41]. While most current works in domain adaptation focus on image classification [1,2,11,15,24,29,30,37,38,39,46,54], a few have delved into object detection [6,10,16,23,40,45]. Data mixing [64,66] is appealing in the context of UDA because of the opportunity to strategically blend cross-domain information during training.…”
Section: Cross-domain Object Detection
confidence: 99%
“…TVT [46] proposes an adaptation module to capture the transferable and discriminative features of domain data. SSRT [36] proposes a framework with a transformer backbone and a safe self-refinement strategy to cope with large domain gaps. More recently, CDTrans [45] proposes a two-step framework that utilizes the cross-attention in ViT for direct feature alignment and pre-generated pseudo labels for the target samples.…”
Section: Related Work
confidence: 99%
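The pseudo-label refinement the excerpt mentions typically filters target predictions by confidence before reusing them as labels. The sketch below is a generic illustration of confidence-filtered self-training, not SSRT's exact safe-refinement procedure; the threshold value is an assumption.

```python
import numpy as np

def select_confident_pseudo_labels(probs, threshold=0.95):
    """Keep only high-confidence target predictions as pseudo labels.

    A prediction on an unlabeled target sample is reused as a training
    label only when the model's top-class probability clears `threshold`,
    limiting the error accumulation a large domain gap can otherwise
    cause during self-training.
    """
    probs = np.asarray(probs)
    confidence = probs.max(axis=1)               # top class probability
    labels = probs.argmax(axis=1)                # predicted class index
    keep = confidence >= threshold
    return labels[keep], keep
```

The returned mask can be used to select the corresponding target samples for the next refinement round.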
“…Furthermore, for fine-tuning purposes, we set the classifier (MLP) to a higher learning rate of 1e-5 for our main tasks and learn the trade-off parameter adaptively. For a fair comparison with prior works, we also conduct experiments with the same backbones: DeiT-based [37] as in CDTrans [45], and ViT-based [11] as in SSRT [36], on Office-31, Office-Home, and VisDA-2017. These two settings are trained for 60 and 100 epochs, respectively.…”
Section: Datasets and Implementation
confidence: 99%
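The fine-tuning setup in the excerpt, a higher learning rate for the classifier head than for the pretrained backbone, can be expressed as per-group learning rates. The classifier rate 1e-5 comes from the excerpt; the backbone rate and group names below are assumed placeholders for illustration.

```python
def fine_tuning_param_groups(backbone_lr=1e-6, classifier_lr=1e-5):
    """Per-group learning rates for fine-tuning a pretrained transformer.

    The randomly initialized classifier MLP trains at the higher rate,
    while the pretrained backbone is updated more conservatively so its
    learned representations are not destroyed early in training.
    """
    return [
        {"name": "backbone", "lr": backbone_lr},
        {"name": "classifier_mlp", "lr": classifier_lr},
    ]
```

In a framework such as PyTorch, a list like this maps directly onto the optimizer's per-parameter-group options.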