Proceedings of the 2017 ACM on Conference on Information and Knowledge Management
DOI: 10.1145/3132847.3132989
A Neural Candidate-Selector Architecture for Automatic Structured Clinical Text Annotation

Abstract: We consider the task of automatically annotating free texts describing clinical trials with concepts from a controlled, structured medical vocabulary. Specifically we aim to build a model to infer distinct sets of (ontological) concepts describing complementary clinically salient aspects of the underlying trials: the populations enrolled, the interventions administered and the outcomes measured, i.e., the PICO elements. This important practical problem poses a few key challenges. One issue is that the output s…

Cited by 12 publications (24 citation statements). References 16 publications.
“…32 describes removing sentence headings in structured abstracts in order to avoid creating a system biased towards common terms, while 63 discusses abbreviations and grammar as factors influencing the results. Several publications demonstrated effects of input-text length 34 and of a sentence's position within a paragraph or abstract, e.g., up to 10% lower classification scores for certain sentence combinations in unstructured abstracts. 30,37,72 3.4.5.3 Is the process of avoiding overfitting or underfitting described?…”
Section: Are Explanations For the Influence Of Both Visible And Hidden Variables In The Dataset Given?
confidence: 97%
“…Most data extraction approaches focused on recognising instances of entity or sentence classes, and a small number of publications went one step further to normalise to actual concepts. 34,58 The 'Other' category includes some more detailed drug annotations 36 or information such as confounders 26 and other entity types (see the full dataset in Underlying data: Appendix A for more information 86 ).…”
Section: Data Extraction Targets
confidence: 99%
“…: {P, I, C, O, R} (Wallace, 2019). Consequently, collecting such explicit evidence is vital for further analyses, and is also the objective for most relevant works: Some seek to find relevant papers through retrieval (Lee and Sun, 2018); many works are aimed at extracting PICO elements from published literature (Wallace et al, 2016;Singh et al, 2017;Jin and Szolovits, 2018;Nye et al, 2018;Zhang et al, 2020); the evidence inference task extracts R for a given ICO query using the corresponding clinical trial report (Lehman et al, 2019;DeYoung et al, 2020). However, since getting expert annotations is expensive, these works are typically limited in scale, with only thousands of labeled instances.…”
Section: Related Work
confidence: 99%
“…One particular challenge of this task is that evidence is entangled with other free-texts in the literature. Prior works have explored explicit methods for evidence integration through a pipeline of retrieval, extraction and inference on structured {P,I,C,O,R} evidence (Wallace et al, 2016;Singh et al, 2017;Jin and Szolovits, 2018;Lee and Sun, 2018;Nye et al, 2018;Lehman et al, 2019;DeYoung et al, 2020;Zhang et al, 2020). However, they are limited in scale since getting domain-specific supervision for all clinical evidence is prohibitively expensive.…”
Section: Introduction
confidence: 99%
“…Overall accuracy is approaching 50%, which may not yet be sufficient for global roll-out, but does represent good progress, considering how challenging this classification task is (bearing in mind there are hundreds of thousands of terms for the machine to learn from very little training data). [22] What are the data?…”
Section: What Type Of Study Is This?
confidence: 99%