2020
DOI: 10.1162/tacl_a_00328

PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized Embedding Models

Abstract: Pivot-based neural representation models have led to significant progress in domain adaptation for NLP. However, previous research following this approach utilizes only labeled data from the source domain and unlabeled data from the source and target domains, but neglects to incorporate massive unlabeled corpora that are not necessarily drawn from these domains. To alleviate this, we propose PERL: a representation learning model that extends contextualized word embedding models such as BERT (Devlin et al., 2019) …
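The abstract is truncated above, but the core idea (selecting pivot features shared across the source and target domains and using them to guide masked-language-model fine-tuning of BERT) can be illustrated with a short, hypothetical sketch. The function names, thresholds, and the mutual-information pivot criterion below are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch of pivot selection + pivot-biased MLM masking.
# All names, probabilities, and thresholds here are illustrative assumptions.
from collections import Counter

import numpy as np
import torch
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import mutual_info_classif
from transformers import BertTokenizerFast


def select_pivots(src_texts, src_labels, tgt_texts, num_pivots=100, min_count=10):
    """Pick unigrams that are frequent in both domains and predictive of the
    source labels, in the spirit of pivot-based domain adaptation."""
    vec = CountVectorizer(lowercase=True)
    x_src = vec.fit_transform(src_texts)
    vocab = np.array(vec.get_feature_names_out())
    # a pivot must also occur often enough in the (unlabeled) target domain
    tgt_counts = Counter(w for t in tgt_texts for w in t.lower().split())
    freq_ok = np.array([tgt_counts[w] >= min_count for w in vocab])
    mi = mutual_info_classif(x_src, src_labels, discrete_features=True)
    mi[~freq_ok] = -1.0  # exclude words that are rare in the target domain
    return set(vocab[np.argsort(mi)[::-1][:num_pivots]])


def pivot_masking(text, tokenizer, pivots, p_pivot=0.5, p_other=0.1):
    """Build (input_ids, labels) for MLM fine-tuning, masking pivot tokens
    more aggressively than non-pivot tokens."""
    input_ids = tokenizer(text, truncation=True, return_tensors="pt")["input_ids"][0]
    labels = torch.full_like(input_ids, -100)  # -100 is ignored by the MLM loss
    tokens = tokenizer.convert_ids_to_tokens(input_ids.tolist())
    for i, tok in enumerate(tokens):
        if tok in tokenizer.all_special_tokens:
            continue
        prob = p_pivot if tok.lstrip("#") in pivots else p_other
        if torch.rand(1).item() < prob:
            labels[i] = input_ids[i]
            input_ids[i] = tokenizer.mask_token_id
    return input_ids, labels


# toy usage with two source reviews (labeled) and two target reviews (unlabeled)
tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
pivots = select_pivots(
    ["a truly great movie", "a dull and boring plot"], [1, 0],
    ["great blender, works well", "boring manual, dull design"],
    num_pivots=5, min_count=1,
)
ids, lbls = pivot_masking("a great but slightly dull gadget", tokenizer, pivots)
```

The masked inputs and labels produced this way would then feed a standard MLM objective, so that the encoder is pushed to predict pivot words from their (domain-shared) context.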


Cited by 40 publications (43 citation statements)
References 28 publications
“…This source-only model even surpasses state-of-the-art methods developed for UDA, e.g. R-PERL (Ben-David et al., 2020).…”
Section: Comparison To State-of-the-art (mentioning)
Confidence: 89%
“…• Two-stage fine-tuning introduces an intermediate supervised training stage between pre-training and fine-tuning (Arase and Tsujii, 2019; Pruksachatkun et al., 2020; Glavaš and Vulić, 2020). Ben-David et al. (2020) propose a pivot-based variant of MLM to fine-tune BERT for domain adaptation.…”
Section: Fine-tuning BERT (mentioning)
Confidence: 99%
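The two-stage recipe this statement describes can be sketched as follows. The checkpoint names and toy data are placeholder assumptions, and the snippet is only a minimal illustration of intermediate supervised training followed by target-task fine-tuning, not the exact setup of any of the cited papers.

```python
# Hypothetical sketch of two-stage fine-tuning: an intermediate supervised
# stage between MLM pre-training and target-task fine-tuning.
import torch
from torch.optim import AdamW
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")

def finetune(model, texts, labels, epochs=1, lr=2e-5):
    """One supervised fine-tuning stage over a tiny in-memory toy batch."""
    opt = AdamW(model.parameters(), lr=lr)
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    y = torch.tensor(labels)
    model.train()
    for _ in range(epochs):
        out = model(**batch, labels=y)
        out.loss.backward()
        opt.step()
        opt.zero_grad()
    return model

# Stage 1: intermediate supervised task (e.g. a related labeled dataset).
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
model = finetune(model, ["great acting", "dull plot"], [1, 0])
model.save_pretrained("stage1-checkpoint")

# Stage 2: fine-tune the stage-1 checkpoint on the target task.
model = AutoModelForSequenceClassification.from_pretrained("stage1-checkpoint")
model = finetune(model, ["sturdy kitchen knife", "flimsy handle"], [1, 0])
```

A pivot-based MLM stage, as in Ben-David et al. (2020), would replace the first supervised stage with masked-language modeling on source and target text before the target-task fine-tuning.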
“…Work on the intersection of data-centric and model-centric methods is plentiful. It currently includes combining semi-supervised objectives with an adversarial loss (Lim et al., 2020; Alam et al., 2018b), combining pivot-based approaches with pseudo-labeling (Cui and Bollegala, 2019) and very recently with contextualized word embeddings (Ben-David et al., 2020), and combining multi-task approaches with domain shift (Jia et al., 2019), multi-task learning with pseudo-labeling (multi-task tri-training) (Ruder and Plank, 2018), and adaptive ensembling (Desai et al., 2019), which uses a student-teacher network with a consistency-based self-ensembling loss and a temporal curriculum. They apply adaptive ensembling to study temporal and topic drift in political data classification (Desai et al., 2019).…”
Section: Hybrid Approaches (mentioning)
Confidence: 99%
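For the student-teacher consistency idea mentioned in this statement, a generic mean-teacher-style loss might look like the sketch below. The architecture, data, and coefficients are illustrative assumptions, not the method of Desai et al. (2019).

```python
# Hypothetical sketch of a student-teacher consistency (self-ensembling) loss.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

student = nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, 2))
teacher = copy.deepcopy(student)  # teacher tracks an EMA of the student
for p in teacher.parameters():
    p.requires_grad_(False)

opt = torch.optim.Adam(student.parameters(), lr=1e-3)

def ema_update(teacher, student, decay=0.99):
    """Move teacher weights toward the student (temporal ensembling)."""
    with torch.no_grad():
        for pt, ps in zip(teacher.parameters(), student.parameters()):
            pt.mul_(decay).add_(ps, alpha=1 - decay)

# toy batch: labeled source features and unlabeled target features
xs, ys = torch.randn(8, 768), torch.randint(0, 2, (8,))
xt = torch.randn(8, 768)

sup_loss = F.cross_entropy(student(xs), ys)
# consistency: student and teacher should agree on unlabeled target data
cons_loss = F.mse_loss(F.softmax(student(xt), dim=-1),
                       F.softmax(teacher(xt), dim=-1))
loss = sup_loss + 1.0 * cons_loss
loss.backward()
opt.step(); opt.zero_grad()
ema_update(teacher, student)
```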