Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022
DOI: 10.18653/v1/2022.acl-long.435

AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages

Abstract: Pretrained multilingual models are able to perform cross-lingual transfer in a zero-shot setting, even for languages unseen during pretraining. However, prior work evaluating performance on unseen languages has largely been limited to low-level, syntactic tasks, and it remains unclear if zero-shot learning of high-level, semantic tasks is possible for unseen languages. To explore this question, we present AmericasNLI, an extension of XNLI (Conneau et al., 2018) to 10 Indigenous languages of the Americas. We co…

Cited by 34 publications (29 citation statements). References 48 publications (39 reference statements).
“…For POS and DP, we sample ten low-resource languages from the Universal Dependencies (UD) 2.7 dataset (Zeman et al, 2020), taking into account: 1) the availability and the size of the corresponding Wikipedia; and 2) typological diversity to ensure that different language families are covered. For NLI, we rely on the recent AmericasNLI dataset (Ebrahimi et al, 2022), spanning ten low-resource languages from the Americas. For AmericasNLI languages, we use Wikipedia if available; otherwise we use the unlabelled data previously used by Ansell et al (2022).…”
Section: Experiments and Results
confidence: 99%
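
The excerpt above evaluates NLI on the AmericasNLI languages. As a minimal sketch of how that evaluation data can be loaded with the Hugging Face `datasets` library: the Hub dataset ID `americas_nli` and the language code `quy` (Quechua) are assumptions about the published naming, not something stated in the excerpt.

```python
# Hedged sketch: loading one AmericasNLI language (Ebrahimi et al., 2022).
# The dataset ID "americas_nli" and config "quy" are assumed Hub names.
from datasets import load_dataset

# AmericasNLI is evaluation-only (zero-shot), so there is no training split;
# we inspect whatever validation/test splits the release provides.
anli_quy = load_dataset("americas_nli", "quy")
print(anli_quy)              # available splits and their sizes
print(anli_quy["test"][0])   # a single premise/hypothesis/label example
```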
“…The adapter reduction factor (Pfeiffer et al, 2020a) is 2 for LAs and BAs and 16 for TAs. For AmericasNLI, we train its TA using the English MultiNLI data (Williams et al, 2018) following the setup of Ebrahimi et al (2022): 5 epochs with a batch size of 32, and a learning rate of 2e−5. We evaluate the TA every 625 steps and choose the one with the best English validation accuracy.…”
Section: Methods
confidence: 99%
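
The excerpt above fully specifies the English MultiNLI training schedule (5 epochs, batch size 32, learning rate 2e-5, evaluation every 625 steps, best checkpoint by English validation accuracy). Below is a minimal sketch of that schedule using the plain Hugging Face `Trainer`; the backbone name is an assumption, and the adapter-specific wiring (a task adapter with reduction factor 16 stacked on language adapters, per Pfeiffer et al., 2020a) is omitted, so this is not the authors' code, only the quoted hyperparameters.

```python
# Hedged sketch of the MultiNLI training schedule quoted above.
# Assumption: XLM-R base with full fine-tuning instead of the task adapter
# (reduction factor 16) that the cited work actually trains.
import numpy as np
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_name = "xlm-roberta-base"  # assumption; not stated in the excerpt
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

mnli = load_dataset("multi_nli")  # English MultiNLI (Williams et al., 2018)

def encode(batch):
    return tokenizer(batch["premise"], batch["hypothesis"],
                     truncation=True, max_length=128)

mnli = mnli.map(encode, batched=True)

def accuracy(eval_pred):
    logits, labels = eval_pred
    return {"accuracy": float((np.argmax(logits, axis=-1) == labels).mean())}

args = TrainingArguments(
    output_dir="mnli-task-model",
    num_train_epochs=5,               # 5 epochs
    per_device_train_batch_size=32,   # batch size 32
    learning_rate=2e-5,               # learning rate 2e-5
    evaluation_strategy="steps",
    eval_steps=625,                   # evaluate every 625 steps
    save_strategy="steps",
    save_steps=625,
    load_best_model_at_end=True,      # keep best English validation accuracy
    metric_for_best_model="accuracy",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=mnli["train"],
    eval_dataset=mnli["validation_matched"],  # English validation set
    compute_metrics=accuracy,
    tokenizer=tokenizer,                      # enables padded batching
)
trainer.train()
```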
“…Multilingual benchmarks or datasets are created in a variety of ways. Several benchmarks are created by translating monolingual benchmarks into different languages, usually through a professional translation service (Artetxe et al, 2020; Conneau et al, 2018; Ebrahimi et al, 2022; Lewis et al, 2020; Li et al, 2021a; FitzGerald et al, 2022; Longpre et al, 2021; Mostafazadeh et al, 2016; Zhang et al, 2019; Lin et al, 2021b; Ponti et al, 2020). Other multilingual benchmarks, instead, have been built by separately annotating each language via its native speakers (e.g.…”
Section: Generalisation Across Languages
confidence: 99%
“…Conneau et al (2018) present XNLI, a multilingual dataset created by translating English NLI examples into other languages. The interest in multilingual NLI has resulted in the creation of some novel non-English resources such as the Korean NLI corpus (Ham et al, 2020), Chinese NLI corpus (Hu et al, 2020), Persian NLI corpus (Amirkhani et al, 2020), Indonesian NLI corpus (Mahendra et al, 2021), and indigenous languages of the Americas NLI corpus (Ebrahimi et al, 2022). For Spanish, the only available resources are the Spanish portion of XNLI and the SPARTE corpus for RTE (Peñas et al, 2006) which was adapted from Question Answering data.…”
Section: Related Work
confidence: 99%