Proceedings of the Fourth Conference on Machine Translation (Volume 2: Shared Task Papers, Day 1) 2019
DOI: 10.18653/v1/w19-5319
Evaluating the Supervised and Zero-shot Performance of Multi-lingual Translation Models

Abstract: We study several methods for full or partial sharing of the decoder parameters of multilingual NMT models. We evaluate both fully supervised and zero-shot translation performance in 110 unique translation directions using only the WMT 2019 shared task parallel datasets for training. We use additional test sets and re-purpose evaluation methods recently used for unsupervised MT in order to evaluate zero-shot translation performance for language pairs where no gold-standard parallel data is available. To our kno…
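The abstract mentions zero-shot translation directions without spelling out the mechanism here; as a hedged illustration only (not necessarily this paper's exact setup), the sketch below shows the standard target-language-token approach (Johnson et al., 2017) that lets a single multilingual model be asked for directions it never saw in training. The `<2xx>` token format and the toy sentences are assumptions.

```python
# Minimal sketch of target-language tagging for multilingual NMT, a common baseline
# that enables zero-shot translation. Not the authors' implementation; the "<2xx>"
# token format and the example sentences are illustrative assumptions.

def tag_example(src_sentence: str, tgt_lang: str) -> str:
    """Prepend a target-language token, e.g. '<2de>' to request German output."""
    return f"<2{tgt_lang}> {src_sentence}"

# A supervised direction seen in training (English -> German):
print(tag_example("Multilingual models share parameters across languages.", "de"))
# A zero-shot request at test time (French -> Czech), possible even when no
# French-Czech parallel data was used for training:
print(tag_example("Les modèles multilingues partagent leurs paramètres.", "cs"))
```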

Cited by 11 publications (12 citation statements); references 36 publications (43 reference statements).
“…Pre-trained embedding is trained on monolingual data for 5 iterations and used as an initialization for the RNN model. (Hokamp et al., 2019) The Aylien research team built a multilingual NMT system which is trained on all WMT 2019 language pairs in all directions, then fine-tuned for a small number of iterations on Gujarati-English data only, including some self-backtranslated data.…”
Section: Apprentice-C (Li and Specia, 2019)
confidence: 99%
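The snippet above describes a two-stage regime: train one multilingual model on all WMT 2019 pairs in both directions, then fine-tune briefly on Gujarati-English, mixing in self-back-translated data. As a rough, self-contained sketch of how such a fine-tuning set could be assembled (the tiny corpora, the synthetic marker string, and the variable names are placeholders, not the Aylien team's pipeline):

```python
# Hedged sketch: assembling a Gujarati-English fine-tuning set from gold parallel
# data plus self-back-translated sentences. All data below are placeholders; this
# is not the Aylien system's actual code.
import random

gold_gu_en = [
    ("ગુજરાતી વાક્ય એક", "Gujarati sentence one"),
    ("ગુજરાતી વાક્ય બે", "Gujarati sentence two"),
]

# Monolingual English sentences translated back into Gujarati by the model itself
# (faked here with a marker string) become extra synthetic training pairs.
mono_en = ["An English sentence with no gold Gujarati translation."]
backtranslated = [(f"<synthetic-gu for: {en}>", en) for en in mono_en]

finetune_set = gold_gu_en + backtranslated
random.shuffle(finetune_set)  # mix gold and synthetic pairs before fine-tuning
for src, tgt in finetune_set:
    print(src, "=>", tgt)
```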
“…Air Force Research Laboratory; APERTIUM-FIN-ENG: Apertium (Pirinen, 2019); APPRENTICE-C: Apprentice (Li and Specia, 2019); AYLIEN_MULTILINGUAL: Aylien Ltd. (Hokamp et al., 2019); BAIDU: Baidu (Sun et al., 2019); BTRANS: (no associated paper)…”
Section: AFRL
confidence: 99%
“…By minimizing the diversity of representations, the decoder's task is simplified and it becomes better at language generation. The choice of a single encoder for all languages is also promoted by Hokamp et al. [64], who opt for language-specific decoders. Murthy et al. [101] pointed out that the sentence representations generated by the encoder are dependent on the word order of the language and are, hence, language-specific.…”
Section: Addressing Language Divergence
confidence: 99%
“…This shows that dedicating a few parameters to learn language tokens can help a decoder maintain a balance between language-agnostic and language-distinct features. Hokamp et al. [64] showed that more often than not, using separate decoders and attention mechanisms gives better results than a shared decoder and attention mechanism. This work implies that the best way to handle language divergence would be to use a shared encoder for source languages and different decoders for target languages.…”
Section: Addressing Language Divergence
confidence: 99%
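Both survey excerpts describe the same architectural choice: a single encoder shared across source languages with a separate decoder (and attention) per target language. The sketch below is a minimal PyTorch rendering of that general idea, not the paper's implementation; the class name, layer sizes, toy vocabulary, and the omission of attention masks are all simplifying assumptions.

```python
# Hedged sketch (PyTorch) of a shared encoder with per-target-language decoders.
# Dimensions, vocabulary size, and the set of target languages are illustrative.
import torch
import torch.nn as nn

class SharedEncSeparateDec(nn.Module):
    def __init__(self, vocab_size=1000, d_model=128, tgt_langs=("de", "cs", "gu")):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)  # shared across sources
        # One decoder stack per target language: decoder parameters are NOT shared.
        self.decoders = nn.ModuleDict({
            lang: nn.TransformerDecoder(
                nn.TransformerDecoderLayer(d_model, nhead=4, batch_first=True),
                num_layers=2)
            for lang in tgt_langs
        })
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, src_ids, tgt_ids, tgt_lang):
        memory = self.encoder(self.embed(src_ids))               # language-agnostic encoding
        hidden = self.decoders[tgt_lang](self.embed(tgt_ids), memory)  # target-specific decoder
        return self.out(hidden)                                  # logits over the target vocab

model = SharedEncSeparateDec()
src = torch.randint(0, 1000, (2, 7))   # batch of 2 source sentences, length 7
tgt = torch.randint(0, 1000, (2, 5))   # teacher-forced target prefix, length 5
logits = model(src, tgt, tgt_lang="de")
print(logits.shape)                    # torch.Size([2, 5, 1000])
```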