2019
DOI: 10.48550/arxiv.1904.00412
Preprint

Surrogate-guided sampling designs for classification of rare outcomes from electronic medical records data

W. Katherine Tan,
Patrick J. Heagerty

Abstract: Scalable and accurate identification of specific clinical outcomes has been enabled by machine learning applied to electronic medical record (EMR) systems. The development of automatic classification requires the collection of a complete labeled data set, where true clinical outcomes are obtained by human expert manual review. For example, the development of natural language processing algorithms requires the abstraction of clinical text data to obtain outcome information necessary for training models. However…

Cited by 1 publication (1 citation statement)
References 16 publications (18 reference statements)
“…Ten studies use NLP to create specific cohorts for research purposes and six reported the performance of their tools. Out of these papers, the majority (n=8) created cohorts for specific medical conditions including fatty liver disease [Goldshtein et al, 2020, Redman et al, 2017], hepatocellular cancer [Sada et al, 2016], ureteric stones [Li and Elliot, 2019], vertebral fracture [Tan and Heagerty, 2019], traumatic brain injury [Yadav et al, 2016, Mahan et al, 2019], and leptomeningeal disease secondary to metastatic breast cancer [Brizzi et al, 2019]. Five papers identified cohorts focused on particular radiology findings including ground glass opacities (GGO) [Van Haren et al, 2019], cerebral microbleeds (CMB) [Noorbakhsh-Sabet et al, 2018], pulmonary nodules [Gould et al, 2015], [Huhdanpaa et al, 2018], changes in the spine correlated to back pain [Bates et al, 2016] and identifying radiological evidence of people having suffered a fall.…”
Section: Cohort and Epidemiology
Citation type: mentioning
confidence: 99%