Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations
DOI: 10.18653/v1/d18-2018
An Interface for Annotating Science Questions

Abstract: Recent work introduces the AI2 Reasoning Challenge (ARC) and the associated ARC dataset, which partitions open-domain, complex science questions into an Easy Set and a Challenge Set. That work includes an analysis of 100 questions with respect to the types of knowledge and reasoning required to answer them. However, it does not include clear definitions of these types, nor does it offer information about the quality of the labels or the annotation process used. In this paper, we introduce a novel interface for h…

Cited by 4 publications (5 citation statements) · References 10 publications
“…It contains a dataset of 2,590 multiple-choice questions written for primary-school science exams (Clark et al., 2018). In this competition, Boratko et al. (2018a, 2018b) verified the effect of query rewriting on a pretrained DrQA model (Chen et al., 2017): the score increased by 0.42, quantitatively confirming the validity of the rewritten queries. Musa et al. (2018) built on the Seq2seq and NCRF models, supplemented by word vectors pretrained on a knowledge graph as prior knowledge, to generate multiple new queries by identifying key items from the original question (OQ).…”
Section: Related Work 2.1 Query Rewriting (mentioning)
confidence: 71%
“…For each question, we provided English translations, as not all annotators were native speakers of the questions' language. We followed the procedure and re-used the annotation types presented in earlier work (Boratko et al., 2018). However, as they were designed mainly for Natural Science questions, we extended them with two new annotation types: "Domain Facts and Knowledge" and "Negation" (see Appendix C for examples).…”
Section: Reasoning and Knowledge Types (mentioning)
confidence: 99%
“…For our reasoning and knowledge type annotations, we followed the procedure and re-used the annotation types presented in (Boratko et al., 2018). However, as they were designed mainly for Natural Science questions, we had to extend them with two new types:…”
Section: Reasoning and Knowledge Types (mentioning)
confidence: 99%
“…In contrast, domain-agnostic AQC is applied in information query or dialogue interactions in which the class labels may comprise question types (e.g., true/false, procedural) [7] or reasoning capabilities (e.g., multi-hop, comparison, algebraic) [8,9]. To enhance the effectiveness of deliberate practice [10], assessment questions are classified into their respective cognitive complexities (e.g., synthesis, evaluation) for instructors to determine learners' proficiencies [11][12][13][14].…”
Section: Motivation (mentioning)
confidence: 99%
“…Questions have also been labeled according to reasoning abilities. The ARC dataset of the AI2 Reasoning Challenge has been annotated by subject-matter experts according to several knowledge and reasoning types [8,9]. Due to overlapping categories, questions belonging to three mutually exclusive class labels (Basic facts, Linguistic matching, Hypothetical) have been selected.…”
Section: Topic Regularization Mechanism (mentioning)
confidence: 99%