Proceedings of the 2nd Workshop on Machine Reading for Question Answering 2019
DOI: 10.18653/v1/d19-5801

MRQA 2019 Shared Task: Evaluating Generalization in Reading Comprehension

Abstract: We present the results of the Machine Reading for Question Answering (MRQA) 2019 shared task on evaluating the generalization capabilities of reading comprehension systems. In this task, we adapted and unified 18 distinct question answering datasets into the same format. Among them, six datasets were made available for training, six datasets were made available for development, and the final six were hidden for final evaluation. Ten teams submitted systems, which explored various ideas including data sampling…
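The unified format mentioned in the abstract is, in the released data, a gzipped JSONL file: a header line identifying the dataset, then one record per context, each carrying its questions and detected answer spans. The sketch below is a minimal reader for that layout; the field names ("context", "qas", "qid", "detected_answers") follow the published MRQA 2019 data release, but treat them here as assumptions rather than a guaranteed schema.

```python
import gzip
import json

def read_mrqa(path):
    """Yield question/context/answer records from an MRQA-format .jsonl.gz file.

    Assumed layout (per the MRQA 2019 data release): the first line is a
    header such as {"header": {"dataset": "SQuAD", "split": "train"}}, and
    every following line is one context with its list of questions.
    """
    with gzip.open(path, "rt", encoding="utf-8") as f:
        header = json.loads(f.readline()).get("header", {})
        for line in f:
            record = json.loads(line)
            context = record["context"]
            for qa in record["qas"]:
                yield {
                    "dataset": header.get("dataset"),
                    "qid": qa["qid"],
                    "question": qa["question"],
                    "context": context,
                    "answers": [a["text"] for a in qa.get("detected_answers", [])],
                }
```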

Cited by 197 publications (248 citation statements) | References 34 publications
“…Figure 2 shows, quantitatively, the discrepancy between predicting the correct answer text and predicting the correct answer span. Using BERT trained on curated NaturalQuestions (Fisch et al., 2019), we show the results of the extractive QA task using exact match (EM) and Span-EM. EM only requires the text to match the ground-truth answer, whereas Span-EM additionally requires the span to be the same as the ground-truth answer span.…”
Section: Introduction (mentioning)
confidence: 99%
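The EM versus Span-EM distinction in the statement above can be made concrete with a small sketch (not the cited paper's code; the normalization follows the common SQuAD-style convention, and the function names are illustrative): EM compares normalized answer strings, while Span-EM additionally requires the predicted span to coincide with a gold span.

```python
import re
import string

def normalize(text):
    """SQuAD-style normalization: lower-case, drop punctuation and articles,
    collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(pred_text, gold_texts):
    """EM: 1.0 if the predicted answer text matches any gold answer text."""
    return float(any(normalize(pred_text) == normalize(g) for g in gold_texts))

def span_exact_match(pred_text, pred_span, gold_texts, gold_spans):
    """Span-EM: the text must match AND the predicted (start, end) span must
    equal one of the gold spans, so copies of the answer string elsewhere in
    the context do not count."""
    return float(
        exact_match(pred_text, gold_texts)
        and tuple(pred_span) in {tuple(s) for s in gold_spans}
    )
```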
“…With this derived dataset, they test the model's capability for unsupervised domain adaptation. Fisch et al. [12] presented the Machine Reading for Question Answering (MRQA) 2019 shared task, which tested extractive MRC models on their ability to generalize to data distributions different from the training distribution. They unified 18 distinct question answering datasets into a uniform format.…”
Section: Derived (mentioning)
confidence: 99%
“…Evaluation of CDA tests the overall performance on all the domains that the MRC model has encountered. As is well known, a large variety of MRC tasks [12] have been proposed in the literature. However, all those tasks assume a stationary learning scenario, i.e., a fixed data distribution.…”
Section: Introduction (mentioning)
confidence: 99%
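As a rough illustration of the evaluation protocol described in the statement above (not code from the cited work; the averaging, metric, and domain names are assumptions), CDA performance can be reported as a per-domain MRC score averaged over every domain the model has encountered so far:

```python
def cda_score(per_domain_f1, encountered_domains):
    """Average an MRC metric (here F1) over all domains seen so far in the
    continual-domain-adaptation sequence."""
    seen = [per_domain_f1[d] for d in encountered_domains]
    return sum(seen) / len(seen)

# Hypothetical usage after adapting on two domains (illustrative numbers):
# cda_score({"SQuAD": 88.0, "NewsQA": 66.0}, ["SQuAD", "NewsQA"])  # -> 77.0
```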
“…For these slots, both extractive and categorical DST models can be applied, as shown in Table 1. The MRQA shared task (Fisch et al., 2019) focused on extractive question answering. MRQA contains six distinct datasets across different domains: SQuAD, NewsQA, TriviaQA, SearchQA, HotpotQA, and NaturalQuestions.…”
Section: Multiple-choice Reading Comprehension to Categorical Dialogue (mentioning)
confidence: 99%