Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
DOI: 10.18653/v1/2021.naacl-main.203

UDALM: Unsupervised Domain Adaptation through Language Modeling

Abstract: In this work we explore Unsupervised Domain Adaptation (UDA) of pretrained language models for downstream tasks. We introduce UDALM, a fine-tuning procedure using a mixed classification and Masked Language Model loss that can adapt to the target domain distribution in a robust and sample-efficient manner. Our experiments show that the performance of models trained with the mixed loss scales with the amount of available target data, and that the mixed loss can be effectively used as a stopping criterion during UDA training…
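The core idea in the abstract is a single encoder trained with a weighted sum of a supervised classification loss (on labeled source-domain data) and a masked-language-model loss (on unlabeled target-domain data). Below is a minimal sketch of such a mixed objective, assuming a BERT-style encoder in PyTorch; the `MixedLossModel` class, the weight `alpha`, and the batch layout are illustrative assumptions, not the paper's exact implementation.

```python
# Minimal sketch of a mixed classification + masked-language-modeling loss on a
# shared BERT encoder, in the spirit of the abstract. The loss weight `alpha`
# and all hyperparameters are illustrative assumptions, not the paper's values.
import torch
import torch.nn as nn
from transformers import BertModel

class MixedLossModel(nn.Module):
    def __init__(self, model_name="bert-base-uncased", num_labels=2):
        super().__init__()
        self.encoder = BertModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.clf_head = nn.Linear(hidden, num_labels)                      # task head
        self.mlm_head = nn.Linear(hidden, self.encoder.config.vocab_size)  # MLM head

    def forward(self, src_batch, tgt_batch, alpha=0.5):
        # Classification loss on labeled source-domain examples.
        src_out = self.encoder(input_ids=src_batch["input_ids"],
                               attention_mask=src_batch["attention_mask"])
        clf_logits = self.clf_head(src_out.last_hidden_state[:, 0])  # [CLS] token
        clf_loss = nn.functional.cross_entropy(clf_logits, src_batch["labels"])

        # Masked-LM loss on unlabeled target-domain examples
        # (labels are -100 everywhere except at masked positions).
        tgt_out = self.encoder(input_ids=tgt_batch["input_ids"],
                               attention_mask=tgt_batch["attention_mask"])
        mlm_logits = self.mlm_head(tgt_out.last_hidden_state)
        mlm_loss = nn.functional.cross_entropy(
            mlm_logits.view(-1, mlm_logits.size(-1)),
            tgt_batch["mlm_labels"].view(-1),
            ignore_index=-100,
        )

        # Mixed objective: weighted sum of the two losses.
        return alpha * clf_loss + (1 - alpha) * mlm_loss
```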

Cited by 28 publications (33 citation statements)
References 44 publications
“…Zero-Shot Models: We apply supervised training on MS MARCO or PAQ and evaluate the trained retrievers on the target datasets. Previous Domain Adaptation Methods: We include two previous unsupervised domain adaptation methods, UDALM (Karouzos et al., 2021) and MoDIR. We follow Thakur et al. (2021b) to train QGen models with the default setting. Cosine similarity is used and the models are fine-tuned for 1 epoch with MNRL.…” (Footnote 12: https://github.com/UKPLab/beir)
Section: Baselines (mentioning)
confidence: 99%
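The quoted baseline setup (cosine similarity, one epoch of MNRL) corresponds to training with MultipleNegativesRankingLoss in the sentence-transformers library. A rough sketch under that assumption follows; the checkpoint name and the toy query–passage pairs are placeholders, not the cited paper's actual training data.

```python
# Sketch of the fine-tuning recipe the statement describes: cosine similarity
# with MultipleNegativesRankingLoss (MNRL) for one epoch. The model name and
# the toy training pairs below are placeholders.
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

model = SentenceTransformer("distilbert-base-uncased")  # placeholder checkpoint

train_examples = [
    InputExample(texts=["generated query 1", "matching passage 1"]),
    InputExample(texts=["generated query 2", "matching passage 2"]),
]
train_loader = DataLoader(train_examples, shuffle=True, batch_size=2)

# MNRL treats the other passages in the batch as negatives; its default
# scoring function is (scaled) cosine similarity.
train_loss = losses.MultipleNegativesRankingLoss(model)

model.fit(train_objectives=[(train_loader, train_loss)], epochs=1)
```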
“…The evaluation results on the original BioASQ and TREC-COVID are available in Appendix C. Evaluation is done using nDCG@10. Previous Domain Adaptation Methods: We include two previous unsupervised domain adaptation methods, UDALM (Karouzos et al., 2021) and MoDIR. UDALM uses the default setting in the original paper, where 15% of the tokens in a text are sampled to be masked and need to be predicted.…”
Section: Discussion (mentioning)
confidence: 99%
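The "15% of tokens are masked" setting quoted above is the standard BERT masking rate; one way to reproduce it is with the Hugging Face MLM data collator, as sketched below. The tokenizer choice and the example sentence are illustrative assumptions.

```python
# Sketch of the 15% token-masking setting the statement refers to, using the
# Hugging Face MLM collator; the example sentence is just an illustration.
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=True,
    mlm_probability=0.15,  # 15% of tokens are sampled for masking
)

encoded = tokenizer(["Unlabeled target-domain text for masked language modeling."],
                    return_tensors="pt")
batch = collator([{"input_ids": encoded["input_ids"][0]}])
# batch["labels"] is -100 everywhere except at the masked positions
# the model is asked to predict.
print(batch["input_ids"], batch["labels"])
```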
“…MoDIR trains models by generating domain-invariant representations to attack a domain classifier. However, as argued in Karouzos et al. (2021), DAT trains models by minimizing the distance between representations from different domains, and such a learning objective can result in a bad embedding space and unstable performance. For sentiment classification, Karouzos et al. (2021) propose UDALM, which is based on multiple stages of training.…”
Section: Related Work (mentioning)
confidence: 99%
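For context on the quoted argument, a generic Domain Adversarial Training (DAT) setup pushes an encoder toward domain-invariant features by reversing the gradients that flow back from a domain classifier. The sketch below is a textbook-style gradient-reversal layer in PyTorch, not the exact MoDIR or DANN implementation.

```python
# Illustrative gradient-reversal sketch of the DAT idea contrasted with UDALM:
# the encoder is trained to fool a domain classifier (source vs. target).
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reverse (and scale) the gradient flowing back into the encoder.
        return -ctx.lambd * grad_output, None

class DomainClassifier(nn.Module):
    def __init__(self, hidden=768, lambd=1.0):
        super().__init__()
        self.lambd = lambd
        self.head = nn.Sequential(nn.Linear(hidden, 128), nn.ReLU(), nn.Linear(128, 2))

    def forward(self, features):
        reversed_features = GradReverse.apply(features, self.lambd)
        return self.head(reversed_features)  # predicts source vs. target domain
```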
“…While we use lessons about careful design choices, our basic models use standard fine-tuning for simplicity. Efforts on the data side have focused on intermediate fine-tuning, either by using unlabeled target-domain data (Karouzos et al., 2021; Gururangan et al., 2020) or via labeled data from other tasks (Phang et al., 2018; Aghajanyan et al., 2021; Vu et al., 2020).…”
Section: Related Work (mentioning)
confidence: 99%