emrQA: A Large Corpus for Question Answering on Electronic Medical Records
Preprint, 2018
DOI: 10.48550/arxiv.1809.00732

Cited by 9 publications (8 citation statements)
References 0 publications

“…Several large-scale automatically collected biomedical QA datasets have been introduced: emrQA (Pampari et al., 2018) is an extractive QA dataset for electronic medical records (EHR) built by re-purposing existing annotations on EHR corpora. BioRead (Pappas et al., 2018) and BMKC (Kim et al., 2018) both collect cloze-style QA instances by masking biomedical named entities in sentences of research articles and using other parts of the same article as context.…”
Section: Related Work
Citation type: mentioning (confidence: 99%)
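The cloze-style construction described in this statement can be illustrated with a short sketch. This is not code from BioRead or BMKC; it only assumes that entity spans come from an upstream NER step, and the sentence, entity, and context below are invented.

```python
# Minimal sketch (not from the cited papers): building a cloze-style QA
# instance by masking a named entity in a sentence, with other text from
# the same article used as the reading context.

from dataclasses import dataclass

@dataclass
class ClozeInstance:
    question: str   # sentence with the entity replaced by a placeholder
    answer: str     # the masked entity
    context: str    # other parts of the same article

def make_cloze_instance(sentence: str, entity: str, context: str,
                        placeholder: str = "@entity@") -> ClozeInstance:
    """Mask one occurrence of `entity` in `sentence` to form a cloze question."""
    if entity not in sentence:
        raise ValueError("entity must appear in the sentence")
    question = sentence.replace(entity, placeholder, 1)
    return ClozeInstance(question=question, answer=entity, context=context)

# Example usage with made-up text:
inst = make_cloze_instance(
    sentence="Metformin is a first-line treatment for type 2 diabetes.",
    entity="Metformin",
    context="... remaining sections of the same research article ...",
)
print(inst.question)  # "@entity@ is a first-line treatment for type 2 diabetes."
```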
“…Yang et al., 2018), the largest annotated biomedical QA dataset, BioASQ (Tsatsaronis et al., 2015), has less than 3k training instances, most of which are simple factual questions. Some works have proposed automatically constructed biomedical QA datasets (Pampari et al., 2018; Pappas et al., 2018; Kim et al., 2018), which are much larger. However, the questions in these datasets are mostly factoid, and their answers can be extracted from the contexts without much reasoning.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
“…In the clinical context, a QA system answers physicians' questions by understanding clinical narratives extracted from electronic health record systems to support decision making. emrQA [24] is the most frequently used benchmark dataset in clinical QA; it contains more than 400,000 question-answer pairs semi-automatically generated from past i2b2 challenges. emrQA falls into the category of extractive question answering, which aims to identify answer spans in reference contexts rather than generating answers word by word.…”
Section: Question Answering
Citation type: mentioning (confidence: 99%)
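As a rough illustration of the extractive setting described in this statement, the sketch below pulls an answer span out of a short, invented clinical note using the Hugging Face transformers question-answering pipeline. The general-domain SQuAD model named here is only a placeholder, not a model used in the cited work or trained on emrQA.

```python
# Minimal sketch of extractive QA: the model returns a character span
# inside the reference context rather than generating free text.
# Assumption: the model name is illustrative; the note is made up.

from transformers import pipeline

qa = pipeline("question-answering",
              model="distilbert-base-cased-distilled-squad")

note = ("The patient was started on lisinopril 10 mg daily for "
        "hypertension and will follow up in two weeks.")

result = qa(question="What medication was the patient started on?",
            context=note)

# `result` holds the extracted span plus its character offsets in `note`.
print(result["answer"], result["start"], result["end"])
```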
“…emrQA [6] is a large training set annotated for reading-comprehension QA (RCQA) in the clinical domain. It was generated by template-based semantic extraction from the i2b2 NLP challenge datasets [7].…”
Section: SQuAD
Citation type: mentioning (confidence: 99%)
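The template-based generation described in this last statement can be sketched roughly as follows. The templates, annotation schema, and example values are invented for illustration; they are not emrQA's actual templates or the i2b2 annotation format.

```python
# Minimal sketch of template-based QA generation: question templates are
# instantiated with entities from existing annotations, and the annotated
# evidence span becomes the answer. Templates and fields here are made up.

from typing import Dict, List, Tuple

TEMPLATES = [
    "Why is the patient on {medication}?",
    "What is the dosage of {medication}?",
]

def generate_qa_pairs(annotation: Dict[str, str],
                      evidence: str) -> List[Tuple[str, str]]:
    """Fill each template with the annotated entity; pair it with the evidence span."""
    pairs = []
    for template in TEMPLATES:
        try:
            question = template.format(**annotation)
        except KeyError:
            continue  # skip templates whose slots this annotation cannot fill
        pairs.append((question, evidence))
    return pairs

# Example usage with an invented annotation:
annotation = {"medication": "lisinopril"}
evidence = "lisinopril 10 mg daily for hypertension"
for q, a in generate_qa_pairs(annotation, evidence):
    print(q, "->", a)
```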