Objective To evaluate whether a natural language processing (NLP) algorithm could be adapted to extract, with acceptable validity, markers of residential instability (ie, homelessness and housing insecurity) from electronic health records (EHRs) of 3 healthcare systems. Materials and methods We included patients 18 years and older who received care at 1 of 3 healthcare systems from 2016 through 2020 and had at least 1 free-text note in the EHR during this period. We conducted the study independently; the NLP algorithm logic and method of validity assessment were identical across sites. The approach to the development of the gold standard for assessment of validity differed across sites. Using the EntityRuler module of spaCy 2.3 Python toolkit, we created a rule-based NLP system made up of expert-developed patterns indicating residential instability at the lead site and enriched the NLP system using insight gained from its application at the other 2 sites. We adapted the algorithm at each site then validated the algorithm using a split-sample approach. We assessed the performance of the algorithm by measures of positive predictive value (precision), sensitivity (recall), and specificity. Results The NLP algorithm performed with moderate precision (0.45, 0.73, and 1.0) at 3 sites. The sensitivity and specificity of the NLP algorithm varied across 3 sites (sensitivity: 0.68, 0.85, and 0.96; specificity: 0.69, 0.89, and 1.0). Discussion The performance of this NLP algorithm to identify residential instability in 3 different healthcare systems suggests the algorithm is generally valid and applicable in other healthcare systems with similar EHRs. Conclusion The NLP approach developed in this project is adaptable and can be modified to extract types of social needs other than residential instability from EHRs across different healthcare systems.
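The three validity measures named above can be computed from a 2x2 comparison of algorithm output against the gold standard. The following is a minimal sketch, not the study's code; the function name and the counts are hypothetical and chosen only for illustration.

```python
def validity_measures(tp, fp, fn, tn):
    """Return precision (PPV), sensitivity (recall), and specificity.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives, and true negatives against a gold standard.
    """
    return {
        "precision": tp / (tp + fp),     # flagged patients who are truly unstable
        "sensitivity": tp / (tp + fn),   # truly unstable patients who were flagged
        "specificity": tn / (tn + fp),   # stable patients correctly left unflagged
    }


# Hypothetical counts, not taken from any of the study sites.
example = validity_measures(tp=85, fp=15, fn=15, tn=185)
```

Precision depends on how common residential instability is in the validation sample, which is one reason it can vary across sites more than sensitivity or specificity do.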
Implementation of guideline-recommended depression screening in medical oncology remains challenging. Evidence suggests that multicomponent care pathways with algorithm-based referral and management are effective, yet implementation of sustainable programs remains limited and implementation-science guided approaches are understudied. OBJECTIVE To evaluate the effectiveness of an implementation-strategy guided depression screening program for patients with breast cancer in a community setting. DESIGN, SETTING, AND PARTICIPANTS A pragmatic cluster randomized clinical trial conducted within Kaiser Permanente Southern California (KPSC). The trial included 6 medical centers and 1436 patients diagnosed with new primary breast cancer who had a consultation with medical oncology from October 1, 2017, through September 30, 2018. Patients were followed up through study end date of May 31, 2019. INTERVENTIONS Six medical centers in Southern California participated and were randomized 1:1 to tailored implementation strategies (intervention, 3 sites, n = 744 patients) or education-only (control, 3 sites, n = 692 patients) groups. The program consisted of screening with the 9-item Patient Health Questionnaire (PHQ-9) and algorithm-based scoring and referral to behavioral health services based on low, moderate, or high score. Clinical teams at tailored intervention sites received program education, audit, and feedback of performance data and implementation facilitation, and clinical workflows were adapted to suit local context. Education-only control sites received program education. MAIN OUTCOMES AND MEASURES The primary outcome was percent of eligible patients screened and referred (based on PHQ-9 score) at intervention vs control groups measured at the patient level. Secondary outcomes included outpatient health care utilization for behavioral health, primary care, oncology, urgent care, and emergency department.
RESULTS All 1436 eligible patients were randomized at the center level (mean age, 61.5 years; 99% women; 18% Asian, 17% Black, 26% Hispanic, and 37% White) and were followed up to the end of the study, insurance disenrollment, or death. Groups were similar in demographic and tumor characteristics. For the primary outcome, 7.9% (59 of 744) of patients at tailored sites were referred compared with 0.1% (1 of 692) at education-only sites (difference, 7.8%; 95% CI, 5.8%-9.8%). Referrals to a behavioral health clinician were completed by 44 of 59 patients treated at the intervention sites (75%) vs 1 of 1 patient at the education-only sites (100%). In adjusted models, patients at tailored sites had significantly fewer outpatient visits in medical oncology (rate ratio, 0.86; 95% CI, 0.86-0.89; P = .001), and no significant difference in utilization of primary care, urgent care, and emergency department visits. CONCLUSIONS AND RELEVANCE Among patients with breast cancer treated in community-based oncology practices, tailored strategies for implementation of routine depression screening compared with an educa...
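The referral difference reported in the results above (59 of 744 vs 1 of 692) can be roughly checked with an unadjusted Wald interval for a difference in proportions. This is a sketch under a strong assumption: it treats patients as independent and ignores the cluster randomization, so it is not necessarily the method the trial used. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference_wald(x1, n1, x2, n2, level=0.95):
    """Unadjusted Wald interval for p1 - p2 (assumes independent patients)."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Standard error of the difference between two independent proportions.
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se


# Referral counts as reported: tailored sites vs education-only sites.
diff, lower, upper = risk_difference_wald(59, 744, 1, 692)
print(f"difference {diff:.1%}, 95% CI {lower:.1%} to {upper:.1%}")
```

Under these assumptions the interval lands close to the reported one; a small discrepancy would be expected if the published analysis accounted for clustering or used a different interval method.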
Importance Including race and ethnicity as a predictor in clinical risk prediction algorithms has received increased scrutiny, but there continues to be a lack of empirical studies addressing whether simply omitting race and ethnicity from the algorithms will ultimately affect decision-making for patients of minoritized racial and ethnic groups. Objective To examine whether including race and ethnicity as a predictor in a colorectal cancer recurrence risk algorithm is associated with racial bias, defined as racial and ethnic differences in model accuracy that could potentially lead to unequal treatment. Design, Setting, and Participants This retrospective prognostic study was conducted using data from a large integrated health care system in Southern California for patients with colorectal cancer who received primary treatment between 2008 and 2013 and follow-up until December 31, 2018. Data were analyzed from January 2021 to June 2022. Main Outcomes and Measures Four Cox proportional hazards regression prediction models were fitted to predict time from surveillance start to cancer recurrence: (1) a race-neutral model that explicitly excluded race and ethnicity as a predictor, (2) a race-sensitive model that included race and ethnicity, (3) a model with 2-way interactions between clinical predictors and race and ethnicity, and (4) separate models by race and ethnicity. Algorithmic fairness was assessed using model calibration, discriminative ability, false-positive and false-negative rates, positive predictive value (PPV), and negative predictive value (NPV). Results The study cohort included 4230 patients (mean [SD] age, 65.3 [12.5] years; 2034 [48.1%] female; 490 [11.6%] Asian, Hawaiian, or Pacific Islander; 554 [13.1%] Black or African American; 937 [22.1%] Hispanic; and 2249 [53.1%] non-Hispanic White).
The race-neutral model had worse calibration, NPV, and false-negative rates among racial and ethnic minority subgroups than non-Hispanic White individuals (eg, false-negative rate for Hispanic patients: 12.0% [95% CI, 6.0%-18.6%]; for non-Hispanic White patients: 3.1% [95% CI, 0.8%-6.2%]). Adding race and ethnicity as a predictor improved algorithmic fairness in calibration slope, discriminative ability, PPV, and false-negative rates (eg, false-negative rate for Hispanic patients: 9.2% [95% CI, 3.9%-14.9%]; for non-Hispanic White patients: 7.9% [95% CI, 4.3%-11.9%]). Inclusion of race interaction terms or using race-stratified models did not improve model fairness, likely due to small sample sizes in subgroups. Conclusions and Relevance In this prognostic study of the racial bias in a cancer recurrence risk algorithm, removing race and ethnicity as a predictor worsened algorithmic fairness in multiple measures, which could lead to inappropriate care recommendations for patients who belong to minoritized racial and ethnic groups. Clinical algorithm development should include evaluation of fairness criteria to understand the potential consequences of removing race and ethnicity for health inequities.
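The subgroup comparison of false-negative rates and PPV described above can be illustrated with a small sketch. It assumes risk predictions have already been dichotomized at a fixed threshold and that recurrence status is fully observed, which simplifies away the time-to-event setting of the Cox models. The function name, group labels, and data are all hypothetical.

```python
from collections import defaultdict


def groupwise_error_rates(records):
    """Compute false-negative rate and PPV per group.

    records: iterable of (group, predicted_high_risk, recurred) tuples,
    where the last two entries are booleans.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for group, predicted, actual in records:
        # "t"/"f" marks whether the prediction was correct; "p"/"n" is the prediction.
        key = ("t" if predicted == actual else "f") + ("p" if predicted else "n")
        counts[group][key] += 1

    rates = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]   # patients who truly recurred
        flagged = c["tp"] + c["fp"]     # patients predicted high risk
        rates[group] = {
            "fnr": c["fn"] / positives if positives else None,
            "ppv": c["tp"] / flagged if flagged else None,
        }
    return rates


# Toy data for two hypothetical groups; not drawn from the study cohort.
toy = (
    [("A", True, True)] * 3 + [("A", False, True)] * 1
    + [("A", True, False)] * 1 + [("A", False, False)] * 5
    + [("B", True, True)] * 2 + [("B", False, True)] * 2
    + [("B", True, False)] * 2 + [("B", False, False)] * 4
)
rates = groupwise_error_rates(toy)
```

A gap between groups in either rate, such as the difference in `fnr` between the two toy groups here, is the kind of disparity the fairness assessment is meant to surface.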
5 Background: Implementation of guideline-recommended distress screening in oncology remains challenging. Evidence suggests that multicomponent care pathways to identify distress severity with algorithm-based referral and management are effective, yet testing of pragmatic implementation in community settings remains limited. We conducted a pragmatic randomized trial of a distress screening program in a large healthcare system to evaluate effectiveness and simultaneously examined implementation outcomes. Methods: We designed a highly pragmatic study per the Pragmatic-Explanatory Continuum Indicator Summary-2 with adaptive workflow design. Randomization was at the medical center level (N=6); eligible patients had a new diagnosis of breast cancer (no exclusions). Eligible patients were offered the distress screening program as part of usual care: PHQ-9 screening, algorithm-based scoring and referral, referral tracking, and audit and feedback of performance data. Control sites had access to the PHQ-9 and scoring algorithm. We compared number screened, distress severity, and referral. We conducted qualitative interviews with stakeholders on implementation barriers and facilitators. Results: We enrolled 1,436 eligible patients; 692 control, 744 intervention. Groups were similar in demographic and tumor characteristics (Table); 80% of patients completed screening at intervention sites vs <1% at control sites. Of those screened at intervention sites, 10% scored in the medium/high range indicating need for referral; 94% received an appropriate referral. We conducted 20 interviews; the program was found to be highly feasible and acceptable. Conclusions: Our pragmatic, adaptive approach resulted in the large majority of patients screened and appropriately referred with a high degree of acceptability and feasibility. Our results can promote more widespread, sustained adoption of effective distress screening programs. Clinical trial information: NCT02941614. [Table: see text]
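The algorithm-based scoring and referral step can be sketched as follows. The PHQ-9 has 9 items, each scored 0 to 3, for a total of 0 to 27. The abstract does not state the trial's cutoffs, so `moderate_cutoff` and `high_cutoff` below are hypothetical placeholders, as are the function names and the three-tier labels.

```python
def phq9_total(item_scores):
    """Sum the 9 PHQ-9 item scores (each 0-3), validating the input."""
    if len(item_scores) != 9 or any(s not in (0, 1, 2, 3) for s in item_scores):
        raise ValueError("PHQ-9 requires exactly 9 item scores, each 0-3")
    return sum(item_scores)


def referral_tier(total, moderate_cutoff=10, high_cutoff=15):
    """Map a PHQ-9 total to a tier.

    The default cutoffs are illustrative assumptions, not the trial's values.
    """
    if total >= high_cutoff:
        return "high"
    if total >= moderate_cutoff:
        return "moderate"
    return "low"


# Hypothetical patient responses.
total = phq9_total([2, 1, 2, 1, 1, 2, 1, 1, 1])
tier = referral_tier(total)
```

Keeping the cutoffs as parameters reflects the adaptive workflow design described above, where each site could fit the referral pathway to its local context.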
Research Objective The International Classification of Diseases (ICD) coding system has codes for recording social determinants of health (SDOH); however, documentation of non‐clinical issues in electronic health records (EHRs) is infrequent compared to medical conditions. ICD codes in EHRs for SDOH identification, therefore, may under‐report patients with social needs and risks, which makes it difficult for healthcare systems to target “high risk” patients for interventions addressing social needs. SDOH may be discussed with healthcare providers during visits and, therefore, recorded in EHR free‐text notes (also known as providers' notes). These notes might provide a more accurate accounting of SDOH; however, traditional approaches for review and abstraction of patient information from medical record notes are laborious, expensive, and slow. Recent developments in text mining and natural language processing (NLP) of digitized text allow for reliable, low‐cost, and rapid extraction of information from EHRs. In this pilot project we evaluated whether an NLP algorithm could extract valid measures of SDOH from Epic‐based EHRs in three healthcare systems: Johns Hopkins Health System (JHHS), Kaiser Permanente Mid‐Atlantic States (KPMAS), and KP Southern California (KPSCal). The focus of our study was residential instability (i.e., homelessness and housing insecurity). Study Design The study was conducted independently, in a parallel and coordinated framework across sites. The validation assessment and NLP algorithm logic were identical across sites; however, the “gold standard” for assessment of algorithm validity differed according to data availability. Using the EntityRuler module of spaCy 2.3 Python toolkit, we created a rule‐based NLP system made up of 61 expert‐developed patterns that, if present, would represent residential instability.
Our patterns included word ‘lemmas’ and base forms to account for morphological variations (e.g., singular and plural forms) as well as substitutions of different prepositions (e.g., about and for) and synonyms (e.g., house, apartment, and home). We calibrated and then validated the algorithm using a split‐sample approach. Validity was assessed at each site by measures of sensitivity and specificity. Population Studied Beneficiaries ≥18 years of age during 2016 through 2019 who received care at JHHS, KPMAS, or KPSCal. Principal Findings The following table presents the characteristics of the study population and performance of the NLP algorithm at each study site.
Site: JHHS | KPMAS | KPSCal
Study population (patient No.): ~1,200,000 | ~1,600,000 | ~4,700,000
NLP validation gold standard method: SDOH Questionnaire | SDOH Questionnaire | SDOH ICD codes, Manual Annotation
Sample size, patients/response No. (with/without residential instability): 1000 (500+/500‐) | 8197 (833+/7364‐) | 300 (150+/150‐)
Clinical note No.: 134,062 | 78,825 | 9575
NLP algorithm performance, sensitivity: 0.84 | 0.61 | 0.96
NLP algorithm performance, specificity: 0.96 | 0.87 | 0.97
Conclusions The consistent performance of this NLP algorithm to identify residential instability in thre...
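The pattern design described above (base forms covering morphological variants, interchangeable prepositions, and synonym nouns) can be illustrated with a simplified stand-in built on Python's standard `re` module. This is not spaCy's EntityRuler and not the study's 61 expert-developed patterns; every pattern and name below is a hypothetical example.

```python
import re

# Synonym nouns with singular/plural variants (hypothetical list).
HOUSING = r"(?:hous(?:es?|ing)|apartments?|homes?)"
# Interchangeable prepositions, as in "worried about" vs "worried for".
PREPOSITION = r"(?:about|for)"

# Illustrative patterns only; the study's actual patterns are not reproduced here.
HYPOTHETICAL_PATTERNS = [
    re.compile(r"\bhomeless(?:ness)?\b", re.IGNORECASE),
    re.compile(r"\bevict(?:ed|ion|ions)\b", re.IGNORECASE),
    re.compile(
        rf"\b(?:worr(?:y|ies|ied)|concern(?:s|ed)?)\s+{PREPOSITION}\s+(?:\w+\s+)?{HOUSING}\b",
        re.IGNORECASE,
    ),
    re.compile(rf"\blos(?:t|e|es|ing)\s+(?:\w+\s+)?{HOUSING}\b", re.IGNORECASE),
]


def flag_residential_instability(note):
    """Return every matched phrase in a free-text note, in pattern order."""
    return [m.group(0) for p in HYPOTHETICAL_PATTERNS for m in p.finditer(note)]


hits = flag_residential_instability(
    "Patient is worried about her housing and was evicted last month."
)
no_hits = flag_residential_instability("Lives at home with spouse.")
```

A token-based matcher such as the EntityRuler works on lemmas rather than hand-written suffix alternations, which is one reason to prefer it over raw regular expressions as the pattern set grows. Neither approach handles negation (for example, "denies homelessness") without additional rules.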
PURPOSE There is growing interest in using computable phenotypes or proxies to identify important clinical outcomes, such as cancer recurrence, in rich electronic health records data. However, the race/ethnicity-specific accuracies of these proxies remain unclear. We examined whether the accuracy of a proxy for colorectal cancer (CRC) recurrence differed by race/ethnicity and the possible mechanisms that drove the differences. METHODS Using data from a large integrated health care system, we identified a stratified random sample of 282 Black/African American (AA), Hispanic, and non-Hispanic White (NHW) patients with CRC who received primary treatment. Patient 5-year recurrence status was estimated using a utilization-based proxy and evaluated against the true recurrence status obtained using detailed chart review and by race/ethnicity. We used covariate-adjusted probit regression models to estimate the associations between race/ethnicity and misclassification. RESULTS The recurrence proxy had excellent overall accuracy (positive predictive value [PPV] 89.4%; negative predictive value 96.5%; mean difference in timing 1.96 months); however, accuracy varied by race/ethnicity. Compared with NHW patients, PPV was 14.9% lower (95% CI, 2.53 to 28.6) among Hispanic patients and 4.3% lower (95% CI, −4.8 to 14.8) among Black/AA patients. The proxy disproportionately inflated the 5-year recurrence incidence for Hispanic patients by 10.6% (95% CI, 4.2 to 18.2). Compared with NHW patients, proxy recurrences for Hispanic patients were almost three times as likely to have been misclassified as positive (adjusted risk ratio 2.91 [95% CI, 1.21 to 8.31]). Higher false positives among racial/ethnic minorities may be related to higher prevalence of noncancerous lung-related problems and substantial delays in primary treatment because of insufficient patient-provider communication and abnormal treatment patterns. 
CONCLUSION Using a proxy with worse accuracy among racial/ethnic minority patients to estimate population health may misdirect resources and support erroneous conclusions around treatment benefit for these patients.