Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1120

Effective Cross-lingual Transfer of Neural Machine Translation Models without Shared Vocabularies

Abstract: Transfer learning or multilingual models are essential for low-resource neural machine translation (NMT), but their applicability is limited to cognate languages that share vocabularies. This paper shows effective techniques to transfer a pre-trained NMT model to a new, unrelated language without shared vocabularies. We relieve the vocabulary mismatch by using cross-lingual word embeddings, train a more language-agnostic encoder by injecting artificial noises, and generate synthetic data easily from the pre-tr…
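As a rough illustration of one technique named in the abstract, the sketch below corrupts source-side token sequences with random deletion, filler insertion, and bounded local reordering before they reach the encoder. The noise types, the rates (p_drop, p_insert, max_shuffle_dist), and the <blank> filler token are assumptions for illustration, not the paper's exact recipe.

```python
import random

def add_artificial_noise(tokens, p_drop=0.1, p_insert=0.1, max_shuffle_dist=3, filler="<blank>"):
    """Corrupt a tokenized source sentence with simple artificial noise (illustrative sketch)."""
    # Randomly drop tokens.
    noisy = [t for t in tokens if random.random() > p_drop]
    # Randomly insert a filler token before some positions.
    out = []
    for t in noisy:
        if random.random() < p_insert:
            out.append(filler)
        out.append(t)
    # Locally shuffle tokens, each moving at most max_shuffle_dist positions.
    keys = [i + random.uniform(0, max_shuffle_dist) for i in range(len(out))]
    return [t for _, t in sorted(zip(keys, out), key=lambda kv: kv[0])]

# Example: corrupt a source sentence before feeding it to the encoder.
print(add_artificial_noise("this is a low resource sentence".split()))
```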

Cited by 62 publications (65 citation statements). References 38 publications.
“…LSTM is a type of recurrent neural network (RNN). LSTM has achieved great success in many applications, such as unconstrained handwriting recognition [46], speech recognition [47], handwriting generation [35], machine translation [48], etc. Each step of the LSTM applies the same repeated neural network module.…”
Section: Inter-atomic Long-dependence Feature Extraction Methods Based…
mentioning; confidence: 99%
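To make the "repeated neural network module" concrete, here is a minimal NumPy sketch of a single LSTM step unrolled over a toy sequence. The fused gate parameterization and the variable names are illustrative choices, not taken from the cited works.

```python
import numpy as np

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM; the same module is applied at every time step."""
    z = W @ x_t + U @ h_prev + b                 # fused affine map for all four gates
    i, f, o, g = np.split(z, 4)                  # input, forget, output gates and candidate
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    c_t = f * c_prev + i * np.tanh(g)            # update the cell (long-term) state
    h_t = o * np.tanh(c_t)                       # emit the hidden (short-term) state
    return h_t, c_t

# Unroll the same module over a toy sequence of 5 input vectors.
d_in, d_hid = 3, 4
rng = np.random.default_rng(0)
W = rng.normal(size=(4 * d_hid, d_in))
U = rng.normal(size=(4 * d_hid, d_hid))
b = np.zeros(4 * d_hid)
h, c = np.zeros(d_hid), np.zeros(d_hid)
for x_t in rng.normal(size=(5, d_in)):
    h, c = lstm_step(x_t, h, c, W, U, b)
print(h)
```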
“…This mapping is learned via the orthogonal Procrustes method [125], using bilingual dictionaries between the sources and the target language [61]. Kim et al. [71] proposed a variant of this approach in which the parent model is trained first and monolingual word embeddings of the child source are mapped to the parent source's embeddings prior to fine-tuning. While Gu et al. [54] require the child and parent sources to be mapped while training the parent model, the mapping in Kim et al.'s [71] model can be trained after the parent model has been trained.…”
Section: Lexical Transfer
mentioning; confidence: 99%
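The orthogonal Procrustes step mentioned above has a closed-form SVD solution. Below is a minimal NumPy sketch that fits an orthogonal map from child-language embeddings to the parent embedding space using dictionary pairs; the function name and the toy data are assumptions for illustration.

```python
import numpy as np

def procrustes_mapping(X, Y):
    """Solve min_W ||X W - Y||_F over orthogonal W (closed form via SVD).

    X: (n, d) child-language embeddings of dictionary entries
    Y: (n, d) parent-language embeddings of their translations
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Toy usage: recover a known orthogonal map between two embedding spaces.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))                      # 5 dictionary entries, dimension 4
W_true, _ = np.linalg.qr(rng.normal(size=(4, 4)))
Y = X @ W_true
W = procrustes_mapping(X, Y)
print(np.allclose(X @ W, Y))                     # True: the orthogonal map is recovered
```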
“…Nguyen and Chiang (2017) and Kocmi and Bojar (2018) with more languages and help target language switches. Kim et al. (2019) propose additional techniques to enable NMT transfer even without shared vocabularies. To the best of our knowledge, we are the first to propose transfer learning strategies specialized in utilizing a pivot language, transferring a source encoder and a target decoder at the same time.…”
Section: Related Work
mentioning; confidence: 99%