emrQA: A Large Corpus for Question Answering on Electronic Medical Records
Preprint, 2018
DOI: 10.48550/arxiv.1809.00732

Cited by 9 publications (8 citation statements)
References 0 publications

“…Several large-scale automatically collected biomedical QA datasets have been introduced: emrQA (Pampari et al., 2018) is an extractive QA dataset for electronic medical records (EHR) built by re-purposing existing annotations on EHR corpora. BioRead (Pappas et al., 2018) and BMKC (Kim et al., 2018) both collect cloze-style QA instances by masking biomedical named entities in sentences of research articles and using other parts of the same article as context.…”
Section: Related Work
Citation type: mentioning (confidence: 99%)
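The cloze-style construction described in this statement can be illustrated with a short sketch. This is not code from BioRead or BMKC; it only assumes that entity spans come from an upstream NER step, and the sentence, entity, and context below are invented.

```python
# Minimal sketch (not from the cited papers): building a cloze-style QA
# instance by masking a named entity in a sentence, with other text from
# the same article used as the reading context.

from dataclasses import dataclass

@dataclass
class ClozeInstance:
    question: str   # sentence with the entity replaced by a placeholder
    answer: str     # the masked entity
    context: str    # other parts of the same article

def make_cloze_instance(sentence: str, entity: str, context: str,
                        placeholder: str = "@entity@") -> ClozeInstance:
    """Mask one occurrence of `entity` in `sentence` to form a cloze question."""
    if entity not in sentence:
        raise ValueError("entity must appear in the sentence")
    question = sentence.replace(entity, placeholder, 1)
    return ClozeInstance(question=question, answer=entity, context=context)

# Example usage with made-up text:
inst = make_cloze_instance(
    sentence="Metformin is a first-line treatment for type 2 diabetes.",
    entity="Metformin",
    context="... remaining sections of the same research article ...",
)
print(inst.question)  # "@entity@ is a first-line treatment for type 2 diabetes."
```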
“…Yang et al., 2018), the largest annotated biomedical QA dataset, BioASQ (Tsatsaronis et al., 2015), has less than 3k training instances, most of which are simple factual questions. Some works have proposed automatically constructed biomedical QA datasets (Pampari et al., 2018; Pappas et al., 2018; Kim et al., 2018), which are much larger. However, the questions in these datasets are mostly factoid, and their answers can be extracted from the contexts without much reasoning.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
“…In the clinical context, a QA system answers physicians' questions by understanding clinical narratives extracted from electronic health record systems to support decision making. emrQA [24] is the most frequently used benchmark dataset in clinical QA; it contains more than 400,000 question-answer pairs semi-automatically generated from past i2b2 challenges. emrQA falls into the category of extractive question answering, which aims to identify answer spans in reference contexts rather than generating answers word by word.…”
Section: Question Answering
Citation type: mentioning (confidence: 99%)
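As a rough illustration of the extractive setting described in this statement, the sketch below pulls an answer span out of a short, invented clinical note using the Hugging Face transformers question-answering pipeline. The general-domain SQuAD model named here is only a placeholder, not a model used in the cited work or trained on emrQA.

```python
# Minimal sketch of extractive QA: the model returns a character span
# inside the reference context rather than generating free text.
# Assumption: the model name is illustrative; the note is made up.

from transformers import pipeline

qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")

note = ("The patient was started on lisinopril 10 mg daily for "
        "hypertension and will follow up in two weeks.")

result = qa(question="What medication was the patient started on?",
            context=note)

# `result` holds the extracted span plus its character offsets in `note`.
print(result["answer"], result["start"], result["end"])
```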
“…emrQA [6] is a large training set annotated for reading-comprehension QA (RCQA) in the clinical domain. It was generated by template-based semantic extraction from the i2b2 NLP challenge datasets [7].…”
Section: SQuAD
Citation type: mentioning (confidence: 99%)
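The template-based generation described in this last statement can be sketched roughly as follows. The templates, annotation schema, and example values are invented for illustration; they are not emrQA's actual templates or the i2b2 annotation format.

```python
# Minimal sketch of template-based QA generation: question templates are
# instantiated with entities from existing annotations, and the annotated
# evidence span becomes the answer. Templates and fields here are made up.

from typing import Dict, List, Tuple

TEMPLATES = [
    "Why is the patient on {medication}?",
    "What is the dosage of {medication}?",
]

def generate_qa_pairs(annotation: Dict[str, str],
                      evidence: str) -> List[Tuple[str, str]]:
    """Fill each template with the annotated entity; pair it with the evidence span."""
    pairs = []
    for template in TEMPLATES:
        try:
            question = template.format(**annotation)
        except KeyError:
            continue  # skip templates whose slots this annotation cannot fill
        pairs.append((question, evidence))
    return pairs

# Example usage with an invented annotation:
annotation = {"medication": "lisinopril"}
evidence = "lisinopril 10 mg daily for hypertension"
for q, a in generate_qa_pairs(annotation, evidence):
    print(q, "->", a)
```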