TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages

Clark, Jonathan H.; Choi, Eunsol; Collins, Michael; Garrette, Dan; Kwiatkowski, Tom; Nikolaev, Vitaly; Palomaki, Jennimaria

doi:10.1162/tacl_a_00317

Cited by 282 publications

(318 citation statements)

References 60 publications

Mentioning: 244

“…XNLI was created by translating examples from the English MultiNLI data set, and projecting its sentence labels (Williams, Nangia, and Bowman 2018). Other recent multilingual data sets target the task of question answering based on reading comprehension: i) MLQA (Lewis et al 2019) includes 7 languages; ii) XQuAD (Artetxe, Ruder, and Yogatama 2019) 10 languages; and iii) TyDiQA (Clark et al 2020) 9 widely spoken typologically diverse languages. While MLQA and XQuAD result from the translation from an English data set, TyDiQA was built independently in each language.…”

Section: Previous Work and Evaluation Data (mentioning)

confidence: 99%

Multi-SimLex: A Large-Scale Evaluation of Multilingual and Crosslingual Lexical Semantic Similarity

Vulić, Baker, Ponti, et al. 2021

Computational Linguistics

We introduce Multi-SimLex, a large-scale lexical resource and evaluation benchmark covering data sets for 12 typologically diverse languages, including major languages (e.g., Mandarin Chinese, Spanish, Russian) as well as less-resourced ones (e.g., Welsh, Kiswahili). Each language data set is annotated for the lexical relation of semantic similarity and contains 1,888 semantically aligned concept pairs, providing a representative coverage of word classes (nouns, verbs, adjectives, adverbs), frequency ranks, similarity intervals, lexical fields, and concreteness levels. Additionally, owing to the alignment of concepts across languages, we provide a suite of 66 cross-lingual semantic similarity data sets. Due to its extensive size and language coverage, Multi-SimLex provides entirely novel opportunities for experimental evaluation and analysis. On its monolingual and cross-lingual benchmarks, we evaluate and analyze a wide array of recent state-of-the-art monolingual and cross-lingual representation models, including static and contextualized word embeddings (such as fastText, monolingual and multilingual BERT, XLM), externally informed lexical representations, as well as fully unsupervised and (weakly) supervised cross-lingual word embeddings. We also present a step-by-step data set creation protocol for creating consistent, Multi-SimLex-style resources for additional languages. We make these contributions (the public release of Multi-SimLex data sets, their creation protocol, strong baseline results, and in-depth analyses which can be helpful in guiding future developments in multilingual lexical semantics and representation learning) available via a website, which will encourage community effort in further expansion of Multi-SimLex to many more languages. Such a large-scale semantic resource could inspire significant further advances in NLP across languages.

“…It can also be used to perform QA in current events via the CORD-19 COVID-19 (Wang et al, 2020; Tang et al, 2020) dataset. In the future we plan on experimenting with additional QA datasets such as Natural Questions (Kwiatkowski et al, 2019) and TyDiQA (Clark et al, 2020).…”

Section: Discussion (mentioning)

confidence: 99%

“…The MLQA dataset contains parallel instances in 7 languages where the context is found in Wikipedia. The TyDiQA (Clark et al, 2020) dataset contains instances in 11 languages. However, TyDiQA is not parallel and it only has instances where the question and context are in the same language.…”

Section: Related Work (mentioning)

confidence: 99%

A Multilingual Reading Comprehension System for more than 100 Languages

Ferritto, Rosenthal, Bornea, et al. 2020

Proceedings of the 28th International Conference on Computational Linguistics: System Demonstrations

This paper presents M-GAAMA, a Multilingual Question Answering architecture and demo system. This is the first multilingual machine reading comprehension (MRC) demo which is able to answer questions in over 100 languages. M-GAAMA answers questions from a given passage in the same or a different language. It incorporates several existing multilingual models that can be used interchangeably in the demo such as M-BERT and XLM-R. The M-GAAMA demo also improves language accessibility by incorporating the IBM Watson machine translation widget to provide additional capabilities to the user to see an answer in their desired language. We also show how M-GAAMA can be used in downstream tasks by incorporating it into an END-TO-END-QA system using CFO (Chakravarti et al., 2019). We experiment with our system architecture on the Multi-Lingual Question Answering (MLQA) and the CORD-19 COVID (Wang et al., 2020; Tang et al., 2020) datasets to provide insights into the performance of the system.

“…The English portion includes instructions by speakers in the USA (en-US) and India (en-IN). Unlike Chen and Mooney (2011) and like the TyDi-QA multilingual question answering dataset (Clark et al, 2020), RxR's instructions are not translations: all instructions are created from scratch by native speakers. This especially matters for VLN, as different languages encode spatial and temporal information in idiosyncratic ways, e.g., how contact/support relationships are expressed (Munnich et al, 2001), frame of reference (Haun et al, 2011), and how temporal accounts are expressed (Bender and Beller, 2014).…”

Section: Motivation (mentioning)

confidence: 99%

Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding

Ku, Anderson, Patel, et al. 2020

Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)

We introduce Room-Across-Room (RxR), a new Vision-and-Language Navigation (VLN) dataset. RxR is multilingual (English, Hindi, and Telugu) and larger (more paths and instructions) than other VLN datasets. It emphasizes the role of language in VLN by addressing known biases in paths and eliciting more references to visible entities. Furthermore, each word in an instruction is time-aligned to the virtual poses of instruction creators and validators. We establish baseline scores for monolingual and multilingual settings and multitask learning when including Room-to-Room annotations (Anderson et al., 2018b). We also provide results for a model that learns from synchronized pose traces by focusing only on portions of the panorama attended to in human demonstrations. The size, scope and detail of RxR dramatically expand the frontier for research on embodied language agents in simulated, photo-realistic environments.
