2019
DOI: 10.1038/s41596-019-0227-6

High-throughput phenotyping with electronic medical record data using a common semi-supervised approach (PheCAP)

Abstract: Phenotypes are the foundation for clinical and genetic studies of disease risk and outcomes. The growth of biobanks linked to electronic medical record (EMR) data has both facilitated and increased the demand for efficient, accurate, and robust approaches for phenotyping millions of patients. Challenges to phenotyping using EMR data include variation in the accuracy of codes, as well as the high level of manual input required to identify features for the algorithm and to obtain gold standard labels. To address…
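The full protocol is described in the paper itself; purely as an illustration of the semi-supervised idea (candidate features screened against a noisy surrogate over all patients, then a penalized model fit on a small gold-standard labeled set), a minimal sketch follows. The column names (`icd_count`, `nlp_count`, `label`) and the screening rule are assumptions for this example, not the PheCAP software interface.

```python
# Minimal sketch of a semi-supervised phenotyping workflow, assuming a patient-level
# table with hypothetical columns: "icd_count" (counts of the main diagnosis code),
# "nlp_count" (mentions of the main concept extracted from notes), other candidate
# EMR features, and "label" (chart-review gold standard; NaN for unlabeled patients).
# This illustrates the general idea only, not PheCAP's actual implementation.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def screen_features(df, candidates, surrogate="icd_count", top_k=20):
    """Semi-supervised screening: rank candidate features by correlation with a
    noisy surrogate computed on ALL patients (labeled and unlabeled alike)."""
    corr = df[candidates].apply(lambda col: np.log1p(col).corr(np.log1p(df[surrogate])))
    return corr.abs().sort_values(ascending=False).head(top_k).index.tolist()

def train_phenotype_model(df, features):
    """Penalized logistic regression fit on the small gold-labeled subset only."""
    labeled = df.dropna(subset=["label"])
    X = np.log1p(labeled[features].to_numpy(dtype=float))
    model = LogisticRegression(penalty="l2", C=1.0, max_iter=1000)
    model.fit(X, labeled["label"].to_numpy())
    return model

def score_all_patients(df, features, model):
    """Predicted phenotype probability for every patient in the EMR cohort."""
    return model.predict_proba(np.log1p(df[features].to_numpy(dtype=float)))[:, 1]
```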


Cited by 113 publications (92 citation statements)
References 44 publications (51 reference statements)
“…The primary outcome was a first ASCVD event after antihypertensive medication initiation, defined as fatal or nonfatal stroke or myocardial infarction. ASCVD events were identified using validated phenotyping algorithms that used diagnosis codes, notes, and other features in the EHR (6,7). We focused on first ASCVD events rather than cumulative incidence, including recurrent events, for two reasons: 1) to maintain comparability with recent clinical trials, and 2) to avoid changes in care after a nonfatal ASCVD event (e.g., medication intensification) that could add unmeasured confounding to the analysis of associations of BP levels in the first 2 years of treatment with subsequent events.…”
Section: Discussion (mentioning)
confidence: 99%
“…To identify stroke and MI, we adapted a phenotyping algorithm designed to identify prevalent cases, using a combination of ICD codes from both VA and Center for Medicare & Medicaid Services data sources, natural language processing, and medical record review labels. 12 These phenotyping methods resulted in a probability of ischemic stroke and MI for each participant. A predicted probability of ischemic stroke of at least 0.770 and of MI of at least 0.765 was defined as a definite case to ensure positive predictive values of at least 90% compared with expert clinician medical record review.…”
Section: Methods (mentioning)
confidence: 99%
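The cutoff step quoted above (probability thresholds of 0.770 and 0.765 chosen so that the positive predictive value against chart review is at least 90%) can be reproduced mechanically on a labeled validation set. The sketch below, with hypothetical inputs and variable names, returns the smallest cutoff whose PPV reaches the target.

```python
import numpy as np

def ppv_at_cutoff(y_true, prob, cutoff):
    """PPV among patients whose predicted probability meets the cutoff."""
    y_true, prob = np.asarray(y_true), np.asarray(prob)
    called = prob >= cutoff
    return float(y_true[called].mean()) if called.any() else float("nan")

def smallest_cutoff_for_ppv(y_true, prob, target=0.90):
    """Smallest probability cutoff whose validation-set PPV reaches the target.
    y_true: 0/1 chart-review labels; prob: predicted phenotype probabilities."""
    for cutoff in np.sort(np.unique(prob)):
        ppv = ppv_at_cutoff(y_true, prob, cutoff)
        if not np.isnan(ppv) and ppv >= target:
            return float(cutoff)
    return None  # target PPV is not attainable on this validation set
```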
“…Recent methods for patient profiling increasingly combine rule-based approaches with AI/ML-based models that make use of clinical text, which remains one of the most important sources of phenotype information. Zhang et al. provide a detailed description of the PheCAP protocol, a high-throughput semi-supervised pipeline for phenotype generation using structured data and information extracted from the narrative notes [11]. Clinical trial recruitment is often the driving requirement for phenotyping.…”
Section: Security and Confidentiality (mentioning)
confidence: 99%
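As a toy illustration of combining structured codes with note-derived features as described in the excerpt above, the sketch below counts phenotype-related mentions with a simple regular expression; a real pipeline would use a clinical NLP system, and the table layout (`patient_id`, `note_text`, `icd_count`) is assumed for the example.

```python
import pandas as pd

# Toy stand-in for a clinical NLP concept extractor: case-insensitive mention count.
STROKE_PATTERN = r"(?i)\b(?:ischemic stroke|cerebral infarct\w*|cva)\b"

def nlp_mention_counts(notes: pd.DataFrame) -> pd.DataFrame:
    """Per-patient count of phenotype-related mentions across narrative notes."""
    return (
        notes.assign(nlp_count=notes["note_text"].str.count(STROKE_PATTERN))
             .groupby("patient_id", as_index=False)["nlp_count"].sum()
    )

def build_feature_table(codes: pd.DataFrame, notes: pd.DataFrame) -> pd.DataFrame:
    """One row per patient: structured ICD count joined with the note-derived count."""
    merged = codes.merge(nlp_mention_counts(notes), on="patient_id", how="left")
    return merged.fillna({"nlp_count": 0})
```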