2021 · Preprint
DOI: 10.48550/arxiv.2104.08726
AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages

Abstract: Pretrained multilingual models are able to perform cross-lingual transfer in a zero-shot setting, even for languages unseen during pretraining. However, prior work evaluating performance on unseen languages has largely been limited to low-level, syntactic tasks, and it remains unclear if zero-shot learning of high-level, semantic tasks is possible for unseen languages. To explore this question, we present AmericasNLI, an extension of XNLI (Conneau et al., 2018) to 10 indigenous languages of the Americas. We co…
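To make the zero-shot setting in the abstract concrete: a multilingual encoder is fine-tuned on NLI data in high-resource languages (typically English) and then applied directly, with no target-language training data, to premise–hypothesis pairs in an unseen language. The following is a minimal Python sketch using the Hugging Face transformers library; the checkpoint name and the example sentence pair are illustrative assumptions, not taken from the paper.

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Assumption: any publicly available multilingual model fine-tuned on
# (X)NLI would work here; this checkpoint name is illustrative only.
model_name = "joeddav/xlm-roberta-large-xnli"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Premise/hypothesis pair in a target language that contributed no NLI
# training data (hypothetical example, not drawn from AmericasNLI).
premise = "El perro duerme en el sofá."
hypothesis = "Un animal está descansando."

inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

# Zero-shot prediction over the three NLI labels:
# entailment / neutral / contradiction.
pred = model.config.id2label[int(logits.argmax(dim=-1))]
print(pred)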

Cited by 1 publication (2 citation statements)
References 25 publications (33 reference statements)

“…Most of these languages are spoken by millions of people, despite being considered low-resource in the research community.” [Quote continues with a flattened table of benchmarks and their language counts: AmericasNLI (Ebrahimi et al., 2021), 10; ALT (Riza et al., 2016), 13; Europarl (Koehn, 2005), 21; TICO-19 (Anastasopoulos et al., 2020), 36; OPUS-100 (Zhang et al., 2020), 100; M2M-100, 100…]
Section: Languages in FLORES-101
Confidence: 99%

“…At present, there are very few benchmarks on low-resource languages. These often have very low coverage of low-resource languages (Riza et al., 2016; Thu et al., 2016; Barrault et al., 2020b; ∀ et al., 2020; Ebrahimi et al., 2021; Kuwanto et al., 2021), limiting our understanding of how well methods generalize and scale to a larger number of languages with a diversity of linguistic features. There are some benchmarks that have high coverage, but these are often in specific domains, like COVID-19 (Anastasopoulos et al., 2020) or religious texts (Christodouloupoulos and Steedman, 2015; Malaviya et al., 2017; Tiedemann, 2018; Agić and Vulić, 2019); or have low quality because they are built using automatic approaches (Zhang et al., 2020; Schwenk et al., 2019, 2021).…”
Section: Introduction
Confidence: 99%