Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019
DOI: 10.18653/v1/d19-1606
Quoref: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning

Abstract: Machine comprehension of texts longer than a single sentence often requires coreference resolution. However, most current reading comprehension benchmarks do not contain complex coreferential phenomena and hence fail to evaluate the ability of models to resolve coreference. We present a new crowdsourced dataset containing more than 24K span-selection questions that require resolving coreference among entities in over 4.7K English paragraphs from Wikipedia. Obtaining questions focused on such phenomena is challenging…
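The abstract describes Quoref as a span-selection dataset: each question is paired with a Wikipedia paragraph and is answered by one or more spans copied from that paragraph. Below is a minimal sketch for inspecting that format, assuming the dataset is mirrored on the Hugging Face Hub under the ID "quoref" with SQuAD-style fields (context, question, answers); the dataset ID and field names are assumptions, not taken from this page.

```python
# Minimal sketch: inspect Quoref's span-selection format.
# Assumption: the dataset is available on the Hugging Face Hub as "quoref"
# with SQuAD-style fields; adjust the ID / field names if your copy differs.
from datasets import load_dataset

quoref = load_dataset("quoref", split="validation")

example = quoref[0]
print(example["question"])
print(example["answers"]["text"])   # gold answer span(s), copied verbatim from the paragraph
print(example["context"][:300])     # the Wikipedia paragraph the span is drawn from
```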

Cited by 111 publications (67 citation statements)
References 21 publications
“…Extractive QA (EX). Among the datasets in this popular format, we adopt SQuAD 1.1 (Rajpurkar et al., 2016), SQuAD 2 (Rajpurkar et al., 2018), NewsQA (Trischler et al., 2017), Quoref (Dasigi et al., 2019), ROPES (Lin et al., 2019).…”
Section: Datasets
Citation type: mentioning (confidence: 99%)
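The statement above groups Quoref with other extractive (span-selection) QA benchmarks. As a rough illustration of what that format means in practice, here is a hedged sketch using the Hugging Face question-answering pipeline; the checkpoint name and the toy passage are illustrative assumptions and are not drawn from the cited papers, and the reader shown is a SQuAD-style model rather than one trained on Quoref.

```python
# Illustrative sketch of extractive (span-selection) QA: the model must
# return a span copied verbatim from the passage. Checkpoint and passage
# are assumptions for demonstration only.
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

context = (
    "Anna grew up in Tallinn before moving to Berlin, where she opened "
    "a small bookshop. She still visits her hometown every summer."
)
# Answering this requires linking "the bookshop owner" back to Anna/"her hometown".
question = "Which city does the bookshop owner visit every summer?"

prediction = qa(question=question, context=context)
print(prediction["answer"], prediction["score"])  # a span from the passage, e.g. "Tallinn"
```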
“…To make our observations and conclusions as general as possible, we experiment over a diverse range of QA datasets with broad domain coverage over questions regarding both factual and commonsense knowledge (Khashabi et al., 2020; Hendrycks et al., 2020; Rajpurkar et al., 2016, 2018; Trischler et al., 2017; Dasigi et al., 2019; Lin et al., 2019; Richardson et al., 2013; Lai et al., 2017; Mihaylov et al., 2018; Talmor et al., 2019b; Bisk et al., 2020; Sakaguchi et al., 2020). We list all the datasets we used in Table 2 and their corresponding domain.…”
Section: LM-based Question Answering
Citation type: mentioning (confidence: 99%)
“…ing comprehension tasks, such as SQuAD 2.0 (Rajpurkar et al., 2018), DROP (Dua et al., 2019b), or Quoref (Dasigi et al., 2019), evaluate models using a relatively simpler setup where all the information required to answer the questions (including judging them as being unanswerable) is provided in the associated contexts. While this setup has led to significant advances in reading comprehension (Ran et al., 2019; Zhang et al., 2020), the tasks are still limited since they do not evaluate the capability of models at identifying precisely what information, if any, is missing to answer a question, and where that information might be found.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)