2019
DOI: 10.1093/jamia/ocz040

Underserved populations with missing race ethnicity data differ significantly from those with structured race/ethnicity documentation

Abstract: Objective: We aimed to address deficiencies in structured electronic health record (EHR) data for race and ethnicity by identifying black and Hispanic patients from unstructured clinical notes and assessing differences between patients with or without structured race/ethnicity data. Materials and Methods: Using EHR notes for 16 665 patients with encounters at a primary care practice, we developed rule-based natural language pro…

Citations: cited by 53 publications (44 citation statements)
References: 12 publications
“…Our NLP pipeline, built using the Apache Unstructured Information Management Architecture-based Leo system, was validated by a manual review of 400 notes and achieved precision of 0.885, recall of 0.939, and F score of 0.911 for classifying black patients. 10 We have described details of the rule-based approach to extract race and ethnicity entities 10 and made the code available at https://github.com/wcmc-research-informatics/CIREX.…”
Section: Race Data Supplementation Methods (mentioning)
confidence: 99%
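
The three figures quoted in this statement can be checked against one another. The sketch below assumes the reported "F score" is the balanced F1 measure (the harmonic mean of precision and recall); the quoted text does not say which F variant was used, and the function name `f1_score` is purely illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Balanced F score (F1): harmonic mean of precision and recall.

    Assumption: the quoted F score is F1; the statement does not specify.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as quoted in the citation statement above.
reported_precision = 0.885
reported_recall = 0.939

f1 = f1_score(reported_precision, reported_recall)
print(round(f1, 3))  # 0.911
```

Under that assumption, the quoted precision and recall reproduce the quoted F score of 0.911.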
“…9 This has a concrete impact on the conduct of observational research or patient cohort discovery reliant on EHR data, given that patients missing structured race data may more likely be from underserved racial groups. 10…”
Section: Background and Significance (mentioning)
confidence: 99%
“…There was a potential for bias in recording of patient characteristics, for example, older patients are more likely to be missing ethnicity. 25 Nonetheless, apart from ethnicity, the proportions of missing data are low. Lastly, our findings on e-consultations may not be as transferable as the other findings.…”
Section: Strengths and Limitations (mentioning)
confidence: 99%
“…WCIMA is a large, academic, hospital-based primary care practice of Weill Cornell Medicine and NewYork-Presbyterian Hospital (https://weillcornell.org/wcima). As a high-volume tertiary-care clinic, it averages 53,000 office visits per year and serves a diverse patient population (13). At WCIMA, 31 attending physicians, 11 nurse practitioners, and six registered nurses provide care, alongside 129 residents and interns.…”
Section: Setting and Patient Population (mentioning)
confidence: 99%