2020
DOI: 10.48550/arxiv.2004.10581
Preprint
When and Why is Unsupervised Neural Machine Translation Useless?

Cited by 10 publications (9 citation statements)
References 34 publications
“…Notice that even in this data-starved setting, we still outperform the competing unsupervised models. Once we reach only 100,000 lines, performance degrades below mBART but still outperforms the bilingual UNMT approach of Kim et al (2020), revealing the power of multilinguality in low-resource settings.…”
Section: Our Approach Is Robust Under Multiple Domainsmentioning
confidence: 96%
“…Unsupervised baselines: For the bilingual unsupervised baselines, we include the results of Kim et al (2020) for En↔Gu and En↔Kk and of for En↔Si. We also report other multilingual unsupervised baselines.…”
Section: Baselinesmentioning
confidence: 99%
“…Another thread of research has pursued learning MT models directly from monolingual data (Artetxe et al, 2017; Lample et al, 2018a,b; Song et al, 2019; Lewis et al, 2019). While unsupervised MT approaches have recently started getting close to the quality of fully supervised systems, these approaches are typically brittle and rely on the availability of large amounts of domain-matched monolingual data across the source and target languages (Marchisio et al, 2020; Kim et al, 2020) — a luxury not available for real-world low-resource languages.…”
Section: Introductionmentioning
confidence: 99%