Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) 2018
DOI: 10.18653/v1/P18-2124

Know What You Don’t Know: Unanswerable Questions for SQuAD

Abstract: Extractive reading comprehension systems can often locate the correct answer to a question in a context document, but they also tend to make unreliable guesses on questions for which the correct answer is not stated in the context. Existing datasets either focus exclusively on answerable questions, or use automatically generated unanswerable questions that are easy to identify. To address these weaknesses, we present SQuAD 2.0, the latest version of the Stanford Question Answering Dataset (SQuAD). SQuAD 2.0 combines existing SQuAD data with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD 2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering. SQuAD 2.0 is a challenging natural language understanding task for existing models: a strong neural system that gets 86% F1 on SQuAD 1.1 achieves only 66% F1 on SQuAD 2.0.
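As a brief illustration (not part of the original abstract), the sketch below loads SQuAD 2.0 with the Hugging Face datasets library, assuming that library is installed and the hub identifier squad_v2 is available; in this release, unanswerable questions are the examples whose answer list is empty, which is what a system must learn to abstain on.

```python
# Illustrative sketch: load SQuAD 2.0 and separate answerable from
# unanswerable questions (unanswerable examples have an empty answer list).
from datasets import load_dataset

squad_v2 = load_dataset("squad_v2")          # splits: "train", "validation"
validation = squad_v2["validation"]

unanswerable = validation.filter(lambda ex: len(ex["answers"]["text"]) == 0)
answerable = validation.filter(lambda ex: len(ex["answers"]["text"]) > 0)

print(f"validation examples: {len(validation)}")
print(f"answerable: {len(answerable)}  unanswerable: {len(unanswerable)}")
```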



Cited by 1,563 publications (1,493 citation statements) | References: 19 publications
“…We evaluated BERT+Entity in the natural language understanding benchmark GLUE (Wang et al., 2018), the question answering (QA) benchmarks SQUAD V2 (Rajpurkar et al., 2018) and SWAG (Zellers et al., 2018), and the machine translation benchmark EN-DE WMT14. [Footnote 1: TagMe's performance on various benchmark datasets ranges from 37% to 72% F1 (Kolitsas et al., 2018).] We confirm the finding from Zhang et al. (2019) that additional entity knowledge is not beneficial for the GLUE benchmark.…”
Section: Introduction (mentioning)
confidence: 99%
“…We leverage an entailment model and a QA model based on BERT [9]. For the entailment model, as the SQuAD 2.0 dataset [35] contains unanswerable questions, we utilize it to train a classifier which tells us whether a pair of <question, answer> matches with the content in the input passage. For the question answering model, we fine-tuned another BERT-based QA model utilizing the SQuAD 1.1 dataset [36].…”
Section: Data Filtering for Quality Control (mentioning)
confidence: 99%
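The filtering step quoted above can be approximated with off-the-shelf tooling. The sketch below is a minimal illustration, assuming the Hugging Face transformers library and a publicly available BERT checkpoint fine-tuned on SQuAD 2.0 (the model name is an assumption, not the checkpoint used in the cited work); the handle_impossible_answer flag lets the pipeline return an empty answer when the passage does not support one, which can serve as the signal for discarding a <question, answer> pair.

```python
# Minimal sketch of answerability-based filtering, assuming a public
# SQuAD 2.0 BERT checkpoint (the model name below is an assumption).
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="deepset/bert-base-cased-squad2",  # assumed SQuAD 2.0 checkpoint
)

passage = "SQuAD 2.0 combines answerable and unanswerable questions."
question = "How many questions does SQuAD 2.0 contain?"

# handle_impossible_answer lets the pipeline return an empty string when
# no span in the passage is supported; such pairs can then be dropped.
result = qa(question=question, context=passage, handle_impossible_answer=True)
keep_pair = bool(result["answer"].strip())
print(result, keep_pair)
```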
“…We use exactly the same format as the popular SQuAD2.0 [27] dataset for our preprocessing output. We keep all questions and answers for a random sample of 25% of the documents as a separate hold-out set.…”
Section: Data Preprocessing (mentioning)
confidence: 99%
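A document-level hold-out like the one described above can be produced with a few lines of standard-library Python. The sketch below assumes the preprocessed data is already stored in the SQuAD 2.0 JSON layout ({"version": ..., "data": [{"title": ..., "paragraphs": [...]}]}); the file names are placeholders, and only the 25% document-level split follows the cited description.

```python
# Sketch: split SQuAD-2.0-format data into a 25% document-level hold-out.
import json
import random

with open("dataset_squad_format.json") as f:   # assumed input file name
    data = json.load(f)["data"]                # one entry per document

random.seed(0)
random.shuffle(data)                           # shuffle whole documents
split = int(len(data) * 0.25)
holdout, train = data[:split], data[split:]

for name, subset in [("holdout.json", holdout), ("train.json", train)]:
    with open(name, "w") as f:
        json.dump({"version": "v2.0", "data": subset}, f)
```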
“…In this section, we discuss how state-of-the-art models for answer selection perform on the DQA data and DQA enhanced with data from the SQuAD2.0 dataset [27]. We select this dataset for two reasons: first, it is a standard dataset for benchmarking Question Answering tasks and, second, like DQA, it contains questions marked as unanswerable, making it closely compatible with our collected data.…”
Section: Answer Selection (mentioning)
confidence: 99%