Proceedings of the Third Conference on Machine Translation: Research Papers 2018
DOI: 10.18653/v1/w18-6309

A neural interlingua for multilingual machine translation

Abstract: We incorporate an explicit neural interlingua into a multilingual encoder-decoder neural machine translation (NMT) architecture. We demonstrate that our model learns a language-independent representation by performing direct zero-shot translation (without using pivot translation), and by using the source sentence embeddings to create an English Yelp review classifier that, through the mediation of the neural interlingua, can also classify French and German reviews. Furthermore, we show that, despite using a sma…
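To make the architecture described in the abstract concrete, below is a minimal sketch of a shared, fixed-size interlingua layer sitting between language-specific encoders and decoders. This is an illustration under assumptions, not the authors' implementation: the GRU encoders/decoders, the multi-head attention bridge, the slot count, and the names InterlinguaBridge and MultilingualNMT are all illustrative choices.

```python
# Minimal sketch (not the paper's code): a fixed-size "interlingua" that attends
# over language-specific encoder states, so every source language yields a
# representation of the same shape. Dimensions and layer choices are assumptions.
import torch
import torch.nn as nn


class InterlinguaBridge(nn.Module):
    """Maps variable-length encoder output to a fixed number of shared slots."""

    def __init__(self, d_model: int = 256, num_slots: int = 10, num_heads: int = 4):
        super().__init__()
        # Learned query slots shared across all languages.
        self.slots = nn.Parameter(torch.randn(num_slots, d_model) * 0.02)
        self.attn = nn.MultiheadAttention(d_model, num_heads, batch_first=True)

    def forward(self, enc_states: torch.Tensor) -> torch.Tensor:
        # enc_states: (batch, src_len, d_model) from a language-specific encoder.
        queries = self.slots.unsqueeze(0).expand(enc_states.size(0), -1, -1)
        bridged, _ = self.attn(queries, enc_states, enc_states)
        return bridged  # (batch, num_slots, d_model), language-independent shape


class MultilingualNMT(nn.Module):
    """Language-specific encoders/decoders around one shared interlingua."""

    def __init__(self, vocab_sizes: dict, d_model: int = 256):
        super().__init__()
        self.embed = nn.ModuleDict(
            {lang: nn.Embedding(v, d_model) for lang, v in vocab_sizes.items()})
        self.encoders = nn.ModuleDict(
            {lang: nn.GRU(d_model, d_model, batch_first=True) for lang in vocab_sizes})
        self.bridge = InterlinguaBridge(d_model)  # shared across all language pairs
        self.decoders = nn.ModuleDict(
            {lang: nn.GRU(d_model, d_model, batch_first=True) for lang in vocab_sizes})
        self.out = nn.ModuleDict(
            {lang: nn.Linear(d_model, v) for lang, v in vocab_sizes.items()})

    def forward(self, src_tokens, src_lang: str, tgt_tokens, tgt_lang: str):
        enc, _ = self.encoders[src_lang](self.embed[src_lang](src_tokens))
        interlingua = self.bridge(enc)           # same shape for every source language
        # Condition the decoder on the mean interlingua state (simplified).
        h0 = interlingua.mean(dim=1, keepdim=True).transpose(0, 1).contiguous()
        dec, _ = self.decoders[tgt_lang](self.embed[tgt_lang](tgt_tokens), h0)
        return self.out[tgt_lang](dec)           # (batch, tgt_len, tgt_vocab)


if __name__ == "__main__":
    model = MultilingualNMT({"en": 1000, "fr": 1000, "de": 1000})
    src = torch.randint(0, 1000, (2, 7))
    tgt = torch.randint(0, 1000, (2, 5))
    print(model(src, "en", tgt, "fr").shape)     # torch.Size([2, 5, 1000])
```

Because the bridge output has the same shape regardless of source language, the same slots can feed any target-language decoder for zero-shot directions, or a downstream classifier such as the Yelp review classifier mentioned in the abstract.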

Cited by 96 publications (94 citation statements) | References 14 publications
“…In this paper we focus on models that allow the translation between many languages, where we outline the development of a language-independent representation based on an attention bridge that is shared across all languages. This is in contrast with previous attempts to obtain such a "neural interlingua" (Lu et al., 2018), where the authors have only tested theirs under a one-to-many and many-to-one scenario. In order to do this, we propose an architecture based on shared self-attention for multilingual NMT with language-specific encoders and decoders, which achieves results comparable to the current state-of-the-art architectures and can also address the task of obtaining language-independent sentence embeddings.…”
Section: Introduction (mentioning)
confidence: 83%
“…explore sharing various components in self-attentional (Transformer) models. Lu et al. (2018) add a shared "interlingua" layer while using separate encoders and decoders. Zaremoodi et al. (2018) utilize recurrent units with multiple blocks together with a trainable routing network.…”
Section: Multilinguality and Zero-shot Performance (mentioning)
confidence: 99%
“…Our pivot adapter (Section 3.2) shares the same motivation as the interlingua component of Lu et al. (2018), but is much more compact, independent of variable input length, and easy to train offline. The adapter training algorithm is adopted from bilingual word embedding mapping (Xing et al., 2015).…”
Section: Related Work (mentioning)
confidence: 99%
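The bilingual word embedding mapping this last statement refers to (Xing et al., 2015) constrains the mapping between two embedding spaces to be orthogonal. One common way to obtain such a mapping offline is the closed-form orthogonal Procrustes solution sketched below; the function name and synthetic data are illustrative, and the cited work may optimize the orthogonality constraint differently.

```python
# Illustrative sketch (not the cited implementation): fit an orthogonal map W
# between two embedding spaces from row-aligned pairs, in the spirit of
# bilingual word embedding mapping (Xing et al., 2015), via an SVD.
import numpy as np


def fit_orthogonal_mapping(X: np.ndarray, Y: np.ndarray) -> np.ndarray:
    """Find orthogonal W minimizing ||X @ W - Y||_F for row-aligned pairs."""
    # X, Y: (n_pairs, dim) source- and target-space vectors for the same items.
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt  # (dim, dim), satisfies W.T @ W = I


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim, n = 64, 500
    X = rng.normal(size=(n, dim))
    true_rot, _ = np.linalg.qr(rng.normal(size=(dim, dim)))  # synthetic ground truth
    Y = X @ true_rot
    W = fit_orthogonal_mapping(X, Y)
    print(np.allclose(X @ W, Y, atol=1e-6))  # True: mapping recovered offline
```

The closed-form solve is what makes this kind of adapter cheap to train offline, in contrast to an interlingua layer that must be learned jointly with the translation model.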