Findings of the Association for Computational Linguistics: EMNLP 2020
DOI: 10.18653/v1/2020.findings-emnlp.381
Vocabulary Adaptation for Domain Adaptation in Neural Machine Translation

Abstract: Neural network methods exhibit strong performance only in a few resource-rich domains. Practitioners therefore employ domain adaptation from resource-rich domains that are, in most cases, distant from the target domain. Domain adaptation between distant domains (e.g., movie subtitles and research papers), however, cannot be performed effectively due to mismatches in vocabulary; it will encounter many domain-specific words (e.g., "angstrom") and words whose meanings shift across domains (e.g., "conductor"). In …

Cited by 20 publications (16 citation statements) · References 25 publications
“…Lewis et al. (2020) first replace the embedding layer with an independent encoder whose vocabulary and parameters are learned from the downstream corpus. Along this line, Sato et al. (2020) exploit external monolingual data to construct a new embedding layer and achieve improvements in domain adaptation. This series of studies empirically confirms the necessity of a suitable vocabulary for the fine-tuning stage.…”
Section: Related Work
confidence: 99%
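This line of work amounts to swapping in a target-domain vocabulary and embedding layer before fine-tuning. The sketch below is a simplified illustration of that idea under our own assumptions, not the exact procedure of Lewis et al. (2020) or Sato et al. (2020): it builds an embedding matrix for a new target-domain vocabulary, reusing pre-trained rows for tokens shared with the source-domain vocabulary and randomly initializing the rest. The helper name build_adapted_embeddings and the toy vocabularies are ours.

```python
import numpy as np

def build_adapted_embeddings(src_vocab, src_emb, tgt_vocab, seed=0):
    """Embedding matrix for a new (target-domain) vocabulary.

    Tokens shared with the source vocabulary keep their pre-trained rows;
    target-only tokens are randomly initialized. A simplified sketch of
    vocabulary replacement before fine-tuning, not any paper's exact method.
    """
    rng = np.random.default_rng(seed)
    dim = src_emb.shape[1]
    src_index = {tok: i for i, tok in enumerate(src_vocab)}
    tgt_emb = rng.normal(scale=0.02, size=(len(tgt_vocab), dim))
    for j, tok in enumerate(tgt_vocab):
        if tok in src_index:              # reuse the pre-trained row
            tgt_emb[j] = src_emb[src_index[tok]]
    return tgt_emb

# Toy usage: adapt a general-domain vocabulary to a scientific-domain one.
src_vocab = ["the", "movie", "conduct", "##or", "<unk>"]
src_emb = np.random.default_rng(1).normal(size=(len(src_vocab), 4))
tgt_vocab = ["the", "angstrom", "conduct", "##or", "<unk>"]
tgt_emb = build_adapted_embeddings(src_vocab, src_emb, tgt_vocab)
print(tgt_emb.shape)  # (5, 4); shared tokens keep their original vectors
```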
“…The pretrain-finetune paradigm has been highly successful in tackling challenging problems in natural language processing, e.g., domain adaptation (Sato et al., 2020; Yao et al., 2020), incremental learning (Khayrallah et al., 2018; Wan et al., 2020), as well as knowledge transfer (Liu et al., 2020b). The rise of large-scale pre-trained language models further attracts increasing attention towards this strategy (Devlin et al., 2019; Edunov et al., 2019).…”
Section: Introduction
confidence: 99%
“…However, it has been reported that adversarial typos can degrade a BERT model that uses subword tokenization (Pruthi et al., 2019; Sun et al., 2020). Subword meanings change across domains, making domain adaptation difficult (Sato et al., 2020). These problems are more critical in the processing of noisy text (Wang et al., 2020; Niu et al., 2020).…”
Section: Bos
confidence: 99%
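To make the typo/segmentation point concrete, here is a self-contained greedy longest-match subword segmenter in the style of WordPiece, run over a toy vocabulary; the vocabulary and the segment helper are illustrative assumptions, not any particular model's tokenizer. A single character swap turns a one-piece word into a run of short, semantically ambiguous fragments.

```python
def segment(word, vocab):
    """Greedy longest-match subword segmentation (WordPiece-style sketch)."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            piece = word[start:end] if start == 0 else "##" + word[start:end]
            if piece in vocab:
                pieces.append(piece)
                break
            end -= 1
        else:                          # no subword matched at this position
            return ["<unk>"]
        start = end
    return pieces

# Toy general-domain vocabulary: the clean word stays intact,
# while a single typo shatters it into short fragments.
vocab = {"conductor", "con", "##d", "##c", "##u", "##t", "##or"}
print(segment("conductor", vocab))  # ['conductor']
print(segment("condcutor", vocab))  # ['con', '##d', '##c', '##u', '##t', '##or']
```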
“…The dynamic nature of language and the limited size of training data require neural network models to handle out-of-vocabulary (OOV) words that are absent from the training data. We thus either use a UNK embedding shared among diverse OOV words or break those OOV words into semantically ambiguous subwords (or even characters), leading to poor task performance (Peng et al., 2019; Sato et al., 2020).…”
Section: Introduction
confidence: 99%
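A minimal contrast between the two fallbacks mentioned above, using hypothetical toy vocabularies rather than any cited system: mapping every OOV word to one shared UNK id collapses distinct words together, while a character-level fallback preserves the surface form at the cost of pieces that carry little meaning on their own.

```python
# Word-level lookup: every out-of-vocabulary (OOV) word collapses onto the
# same shared <unk> id, so "angstrom" and "voltmeter" become indistinguishable.
word_vocab = {"<unk>": 0, "the": 1, "movie": 2, "conductor": 3}

def word_ids(tokens, vocab):
    return [vocab.get(tok, vocab["<unk>"]) for tok in tokens]

print(word_ids(["the", "angstrom", "conductor"], word_vocab))   # [1, 0, 3]
print(word_ids(["the", "voltmeter", "conductor"], word_vocab))  # [1, 0, 3]

# Character fallback: the OOV word survives, but only as a sequence of
# pieces whose individual meanings are ambiguous.
def char_pieces(word, alphabet):
    return [ch if ch in alphabet else "<unk>" for ch in word]

print(char_pieces("angstrom", set("abcdefghijklmnopqrstuvwxyz")))
# ['a', 'n', 'g', 's', 't', 'r', 'o', 'm']
```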
“…Distant domain transfer learning (DDTL) is a common scenario in real-world transfer learning applications [1][2][3][4][5]. However, one key challenge remains: current transfer learning approaches do not work well when the source domain is very distant from the target domain.…”
Section: Introduction
confidence: 99%