2021
DOI: 10.48550/arxiv.2104.07078
Preprint

UDALM: Unsupervised Domain Adaptation through Language Modeling

Abstract: In this work we explore Unsupervised Domain Adaptation (UDA) of pretrained language models for downstream tasks. We introduce UDALM, a fine-tuning procedure, using a mixed classification and Masked Language Model loss, that can adapt to the target domain distribution in a robust and sample efficient manner. Our experiments show that performance of models trained with the mixed loss scales with the amount of available target data and the mixed loss can be effectively used as a stopping criterion during UDA trai…
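The abstract describes UDALM as fine-tuning with a mixed objective that combines a supervised classification loss on labeled source-domain data with a Masked Language Model loss on unlabeled target-domain data. The following is a minimal PyTorch sketch of such a mixed loss, assuming a shared BERT-style encoder from Hugging Face transformers; the class name, the simple linear MLM head, and the mlm_weight coefficient are illustrative assumptions, not the authors' exact implementation.

import torch
import torch.nn as nn
from transformers import AutoModel

class MixedLossModel(nn.Module):
    """Shared encoder with a classification head (labeled source data)
    and an MLM head (unlabeled target data). Hypothetical sketch only."""

    def __init__(self, model_name="bert-base-uncased", num_labels=2, mlm_weight=1.0):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.classifier = nn.Linear(hidden, num_labels)
        self.mlm_head = nn.Linear(hidden, self.encoder.config.vocab_size)
        self.mlm_weight = mlm_weight  # assumed mixing coefficient

    def forward(self, src_batch, tgt_batch):
        # Classification loss on labeled source-domain examples ([CLS] token).
        src_out = self.encoder(input_ids=src_batch["input_ids"],
                               attention_mask=src_batch["attention_mask"])
        cls_logits = self.classifier(src_out.last_hidden_state[:, 0])
        clf_loss = nn.functional.cross_entropy(cls_logits, src_batch["labels"])

        # Masked-LM loss on unlabeled target-domain examples; labels hold the
        # original token ids at masked positions and -100 elsewhere.
        tgt_out = self.encoder(input_ids=tgt_batch["input_ids"],
                               attention_mask=tgt_batch["attention_mask"])
        mlm_logits = self.mlm_head(tgt_out.last_hidden_state)
        mlm_loss = nn.functional.cross_entropy(
            mlm_logits.view(-1, mlm_logits.size(-1)),
            tgt_batch["labels"].view(-1),
            ignore_index=-100,
        )

        # Mixed objective: supervised task loss plus weighted MLM loss.
        return clf_loss + self.mlm_weight * mlm_loss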

Cited by 1 publication (3 citation statements)
References 8 publications (10 reference statements)
“…Unsupervised domain adaptation (UDA) is an essential task in the realm of deep learning since it mitigates the expensive burden of manual annotation by focusing on cheap unlabeled data from target domains [Ramponi and Plank, 2020]. Among all existing approaches for UDA, pre-trained language model (PrLM) based approaches have become the de facto standard [Gururangan et al., 2020; Ben-David et al., 2020; Yu et al., 2021; Karouzos et al., 2021] since these PrLMs are equipped with generic knowledge learned from large corpora [Howard and Ruder, 2018] and lead to promising results. The primary focus of UDA methods is to capture transferable features for the target domain while preserving the knowledge learned from the source domain [Blitzer et al., 2006; Pan et al., 2010]. …”
Section: Introduction (mentioning)
confidence: 99%
“…The primary focus of UDA methods is to capture transferable features for the target domain while preserving the knowledge learned from the source domain [Blitzer et al., 2006; Pan et al., 2010]. However, most existing pre-training-based UDA approaches are carried out by fine-tuning the entire set of model parameters on domain-specific corpora [Gururangan et al., 2020; Yu et al., 2021; Karouzos et al., 2021], which are usually of limited size. Such a setting may easily drift the PrLM to a specific domain and distort the generic knowledge embedded in the original PrLM weights [He et al., 2021]. …”
Section: Introduction (mentioning)
confidence: 99%