2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr52688.2022.00705
Safe Self-Refinement for Transformer-based Domain Adaptation

Cited by 52 publications (22 citation statements)
References 15 publications
“…These methods can either explicitly align the feature distributions using a specific distance metric [34,36,53], or implicitly align the distributions using an adversarial loss [8,13,35] or GAN [21,41]. While most current works in domain adaptation focus on image classification [1,2,11,15,24,29,30,37,38,39,46,54], a few have delved into object detection [6,10,16,23,40,45]. Data mixing [64,66] is appealing in the context of UDA because of the opportunity to strategically blend cross-domain information during training.…”
Section: Cross-domain Object Detection
confidence: 99%
“…TVT [46] proposes an adaptation module to capture the transferable and discriminative features of domain data. SSRT [36] proposes a framework with a transformer backbone and a safe self-refinement strategy to cope with large domain gaps. More recently, CDTrans [45] proposes a two-step framework that utilizes the cross-attention in ViT for direct feature alignment and pre-generated pseudo labels for the target samples.…”
Section: Related Work
confidence: 99%
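The pseudo-label refinement the excerpt mentions typically filters target predictions by confidence before reusing them as labels. The sketch below is a generic illustration of confidence-filtered self-training, not SSRT's exact safe-refinement procedure; the threshold value is an assumption.

```python
import numpy as np

def select_confident_pseudo_labels(probs, threshold=0.95):
    """Keep only high-confidence target predictions as pseudo labels.

    A prediction on an unlabeled target sample is reused as a training
    label only when the model's top-class probability clears `threshold`,
    limiting the error accumulation a large domain gap can otherwise
    cause during self-training.
    """
    probs = np.asarray(probs)
    confidence = probs.max(axis=1)               # top class probability
    labels = probs.argmax(axis=1)                # predicted class index
    keep = confidence >= threshold
    return labels[keep], keep
```

The returned mask can be used to select the corresponding target samples for the next refinement round.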
“…Furthermore, for fine-tuning purposes, we set the classifier (MLP) to a higher learning rate of 1e-5 for our main tasks and learn the trade-off parameter adaptively. For a fair comparison with prior works, we also conduct experiments with the same backbones: DeiT-based [37] as in CDTrans [45], and ViT-based [11] as in SSRT [36], on Office-31, Office-Home, and VisDA-2017. These two settings are trained for 60 and 100 epochs, respectively.…”
Section: Datasets and Implementation
confidence: 99%
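The fine-tuning setup in the excerpt, a higher learning rate for the classifier head than for the pretrained backbone, can be expressed as per-group learning rates. The classifier rate 1e-5 comes from the excerpt; the backbone rate and group names below are assumed placeholders for illustration.

```python
def fine_tuning_param_groups(backbone_lr=1e-6, classifier_lr=1e-5):
    """Per-group learning rates for fine-tuning a pretrained transformer.

    The randomly initialized classifier MLP trains at the higher rate,
    while the pretrained backbone is updated more conservatively so its
    learned representations are not destroyed early in training.
    """
    return [
        {"name": "backbone", "lr": backbone_lr},
        {"name": "classifier_mlp", "lr": classifier_lr},
    ]
```

In a framework such as PyTorch, a list like this maps directly onto the optimizer's per-parameter-group options.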