Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2018
DOI: 10.18653/v1/p18-1019
A Corpus with Multi-Level Annotations of Patients, Interventions and Outcomes to Support Language Processing for Medical Literature

Abstract: We present a corpus of 5,000 richly annotated abstracts of medical articles describing clinical randomized controlled trials. Annotations include demarcations of text spans that describe the Patient population enrolled, the Interventions studied and to what they were Compared, and the Outcomes measured (the ‘PICO’ elements). These spans are further annotated at a more granular level, e.g., individual interventions within them are marked and mapped onto a structured medical vocabulary. We acquired annotations f…

Cited by 141 publications (214 citation statements)
References 32 publications (38 reference statements)
“…The mean of the Population (P) sentence scores is significantly lower than that for other types of sentences (I and O), again indicating that they are easier on average to annotate. This aligns with a previous finding that annotating Interventions and Outcomes is more difficult than annotating Participants (Nye et al., 2018).…”
Section: Quantifying Task Difficulty (supporting)
confidence: 92%
“…We again use LSTM-CRF-Pattern as the base model and experiment on the EBM-NLP corpus (Nye et al., 2018). This is trained on either (1) the training set with difficult sentences removed, or (2) the full training set but with instances reweighted in proportion to their predicted difficulty score.…”
Section: Better IE With Difficulty Prediction (mentioning)
confidence: 99%
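The reweighting variant described in that excerpt — scaling each training instance's loss in proportion to a predicted difficulty score — can be sketched as below. This is a minimal illustrative sketch, not the cited authors' implementation; the function names (`difficulty_weights`, `weighted_loss`) and the mean-one normalization are assumptions made here for clarity.

```python
def difficulty_weights(scores, eps=1e-8):
    """Turn predicted difficulty scores into per-instance loss weights,
    proportional to difficulty and normalized so the mean weight is 1.0
    (this keeps the overall loss on the same scale as uniform weighting).
    Falls back to uniform weights if all scores are (near) zero."""
    n = len(scores)
    total = sum(scores)
    if total < eps:
        return [1.0] * n
    return [s * n / total for s in scores]

def weighted_loss(losses, weights):
    """Mean of per-instance losses, each scaled by its weight."""
    return sum(l * w for l, w in zip(losses, weights)) / len(losses)
```

With difficulty scores `[1.0, 2.0, 3.0]`, for example, the weights come out as `[0.5, 1.0, 1.5]`: the hardest instance contributes three times as much to the loss as the easiest one, while the average contribution is unchanged.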
“…In the following, we refer to these data as the ebm-nlp corpus (Nye et al., 2018). The ebm-nlp corpus provided us with 5,000 tokenized and annotated RCT abstracts for training, and 190 expert-annotated abstracts for testing.…”
Section: Ebm-nlp (mentioning)
confidence: 99%