2020
DOI: 10.1609/aaai.v34i05.6319

QASC: A Dataset for Question Answering via Sentence Composition

Abstract: Composing knowledge from multiple pieces of text is a key challenge in multi-hop question answering. We present a multi-hop reasoning dataset, Question Answering via Sentence Composition (QASC), that requires retrieving facts from a large corpus and composing them to answer a multiple-choice question. QASC is the first dataset to offer two desirable properties: (a) the facts to be composed are annotated in a large corpus, and (b) the decomposition into these facts is not evident from the question itself. The …
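To make the two-fact composition structure concrete, here is a minimal sketch of inspecting a QASC item. It assumes the Hugging Face dataset id "allenai/qasc" and the field names (question, choices, answerKey, fact1, fact2, combinedfact), which should be verified against the dataset card before use.

```python
# Minimal sketch: load QASC and print one item's annotated facts.
# Dataset id and field names are assumptions, not confirmed by this page.
from datasets import load_dataset

qasc = load_dataset("allenai/qasc", split="train")
ex = qasc[0]

print(ex["question"])  # multiple-choice question stem
for label, text in zip(ex["choices"]["label"], ex["choices"]["text"]):
    print(f"  ({label}) {text}")
print("gold:", ex["answerKey"])

# The two corpus facts annotated for composition, and their
# human-written composed fact:
print("fact1:", ex["fact1"])
print("fact2:", ex["fact2"])
print("composed:", ex["combinedfact"])
```

Note that the decomposition into fact1 and fact2 is annotated in the data but, per the abstract, is deliberately not recoverable from the question text alone.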

Cited by 120 publications (154 citation statements)
References 18 publications
“…Multiple-choice QA (MC). We use the following MC datasets: MCTest (Richardson et al., 2013), RACE (Lai et al., 2017), OpenBookQA/OBQA (Mihaylov et al., 2018), ARC (Clark et al., 2018, 2016), QASC (Khot et al., 2019), CommonsenseQA/CQA (Talmor et al., 2019), PIQA (Bisk et al., 2020), SIQA (Sap et al., 2019), and Winogrande (Sakaguchi et al., 2020). Several of the MC datasets do not come with accompanying paragraphs (such as ARC, QASC, OBQA).…”
Section: Datasets
confidence: 99%
“…The first few rows of the table show T5 models trained for individual formats, followed by UNIFIEDQA. For completeness, we include the highest previous scores for each dataset; one must be careful when reading these numbers, as the best previous numbers follow the fully supervised protocol (for NewsQA (Zhang et al., 2020), Quoref (Segal et al., 2019), DROP (Lan et al., 2019), ROPES (Lin et al., 2019), QASC (Khot et al., 2019), CommonsenseQA (Zhu et al., 2020), and x-CS datasets (Gardner et al., 2020)).…”
Section: Generalization to Unseen Datasets
confidence: 99%
“…2017), we extend their architecture with a hierarchy-like structure of bidirectional LSTM (BiLSTM) layers with max pooling. All in all, our model improves the previous state of the art for SciTail (Khot, Sabharwal, and Clark 2018) and achieves strong results for the Stanford Natural Language Inference (SNLI) and Multi-Genre Natural Language Inference corpus (MultiNLI; Williams, Nangia, and Bowman 2018).…”
Section: Introduction
confidence: 55%
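For readers unfamiliar with the architecture the excerpt above describes, below is a minimal PyTorch sketch of a bidirectional LSTM sentence encoder with max pooling over time. The class name, dimensions, and the sentence-pair combination noted in the comment are illustrative assumptions, not the cited paper's exact implementation.

```python
import torch
import torch.nn as nn

class BiLSTMMaxEncoder(nn.Module):
    """Sentence encoder: BiLSTM over token embeddings, max-pooled
    over the time axis (hypothetical rendering of the building block
    described in the excerpt above)."""
    def __init__(self, vocab_size: int, embed_dim: int = 300, hidden_dim: int = 512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim,
                            batch_first=True, bidirectional=True)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len)
        h, _ = self.lstm(self.embed(token_ids))  # (batch, seq_len, 2*hidden_dim)
        return h.max(dim=1).values               # max pooling over time

# An NLI head would typically combine premise vector u and hypothesis
# vector v as [u; v; |u - v|; u * v] before a small classifier.
```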
“…SciTail: SciTail (Khot et al. 2018) is an NLI dataset created from multiple-choice science exams, consisting of 27k sentence pairs. Each question and the correct answer choice have been converted into an assertive statement to form the hypothesis.…”
Section: Evaluation Benchmarks
confidence: 99%