2019
DOI: 10.48550/arxiv.1907.05019
Preprint

Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges

Abstract: We introduce our efforts towards building a universal neural machine translation (NMT) system capable of translating between any language pair. We set a milestone towards this goal by building a single massively multilingual NMT model handling 103 languages trained on over 25 billion examples. Our system demonstrates effective transfer learning ability, significantly improving translation quality of low-resource languages, while keeping high-resource language translation quality on-par with competitive bilingual baselines. …

Cited by 140 publications (198 citation statements) · References 104 publications
“…DOCmT5-5 significantly outperforms Doc-NMT and DocTLM, showing that our proposed pretraining objective leads to improved cross-lingual learning. The results of DOCmT5-25 are inferior to DOCmT5-5, and this is possibly due to capacity dilution (Arivazhagan et al., 2019). As we increase the capacity, we see that DOCmT5-25-Large outperforms DOCmT5-5-Large.…”
Section: Results on Seen Language Pairs (mentioning)
confidence: 79%
“…In spite of the aforementioned near-human results on translation or understanding of languages from the world's economic and political superpowers, the experience of any NLP practitioner is that, for the vast majority of languages, they fall far below such standards. Critically, the languages of the world showcase substantial amounts of variation in most domains of description, and in fact, the performance of language technologies has been shown to be sensitive to diverse aspects of the language under study, including morphology, word order, or phonological repertoire, as well as more mundane aspects like data availability (Tsarfaty et al., 2020; Xia et al., 2020; Arivazhagan et al., 2019). Hence, the transfer of NLP developments from one language to another is far from trivial, as it often means that building highly functional language technologies on any particular language is a non-automatic, costly, and technically challenging task.…”
Section: Introduction (mentioning)
confidence: 99%
“…Multilingual learning has the potential of cross-lingual transfer, allowing low-resource languages to benefit from high-resource data when trained together (Conneau et al., 2019). However, in practice, this positive transfer is often mitigated by interference between languages (Arivazhagan et al., 2019; Tan et al., 2019; Zhang et al., 2020). This is because all languages, irrespective of the amount of data, are trained with a fixed model capacity, leading to insufficient specialized capacity.…”
Section: Introduction (mentioning)
confidence: 99%
“…We propose two straightforward techniques to improve BASELayers-based sparse architectures (Lewis et al., 2021) for multitask learning: first, we slowly ramp the number of instances from low-resource tasks over epochs rather than having a fixed sampling ratio (Arivazhagan et al., 2019). This promotes cross-lingual transfer and reduces over-fitting as the model witnesses low-resource task instances in the later epochs.…”
Section: Introduction (mentioning)
confidence: 99%
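
The statement above contrasts a fixed per-language sampling ratio with one that is gradually ramped over training so that low-resource examples are seen more often in later epochs. Below is a minimal sketch of both ideas, assuming temperature-based smoothing for the fixed ratio and a simple linear ramp for the schedule; the temperature value, the ramp shape, and the function names are illustrative assumptions rather than the exact procedures used in the cited papers.

```python
import numpy as np

def temperature_sampling_probs(example_counts, temperature=5.0):
    # Fixed per-language ratios: p_i proportional to n_i ** (1 / T).
    # T = 1 reproduces the raw data distribution; a larger T moves toward
    # uniform sampling over languages, up-weighting low-resource ones.
    counts = np.asarray(example_counts, dtype=np.float64)
    scaled = counts ** (1.0 / temperature)
    return scaled / scaled.sum()

def ramped_sampling_probs(example_counts, epoch, total_epochs, temperature=5.0):
    # Hypothetical ramping schedule: interpolate from the raw data
    # distribution toward the temperature-smoothed one as training
    # progresses, so low-resource tasks contribute more in later epochs.
    counts = np.asarray(example_counts, dtype=np.float64)
    proportional = counts / counts.sum()
    smoothed = temperature_sampling_probs(counts, temperature)
    alpha = min(1.0, epoch / max(1, total_epochs - 1))  # goes 0 -> 1 over training
    mixed = (1.0 - alpha) * proportional + alpha * smoothed
    return mixed / mixed.sum()

# Toy example: three languages with very different corpus sizes.
counts = [2_000_000_000, 5_000_000, 100_000]
print(temperature_sampling_probs(counts))          # fixed ratios for the whole run
for epoch in (0, 4, 9):
    print(epoch, ramped_sampling_probs(counts, epoch, total_epochs=10))
```

In the fixed-ratio setting the first printed distribution is reused for the entire run, whereas the ramped variant starts close to the proportional-to-data distribution and ends at the smoothed one.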