Importance
Accurate, real-time case identification is needed to target interventions to improve quality and outcomes for hospitalized patients with heart failure. Problem lists may be useful for case identification, but are often inaccurate or incomplete. Machine learning approaches may improve accuracy of identification but can be limited by complexity of implementation.
Objective
To develop algorithms that use readily available clinical data to identify patients with heart failure while they are hospitalized.
Design, Setting, and Participants
We performed a retrospective study of hospitalizations at an academic medical center. Hospitalizations for patients ≥18 years who were admitted after January 1, 2013 and discharged prior to February 28, 2015 were included. From a random 75% sample of hospitalizations, we developed five algorithms for heart failure identification using electronic health record (EHR) data: 1) heart failure on problem list; 2) presence of at least one of three characteristics: heart failure on problem list, inpatient loop diuretic, or brain natriuretic peptide ≥500 pg/ml; 3) logistic regression of 30 clinically relevant structured data elements; 4) machine learning approach using unstructured notes; 5) machine learning approach using both structured and unstructured data.
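As an illustration of how simple the rule-based approach is, algorithm 2 can be sketched as a single boolean "any of three" check. This is a minimal sketch, not the study's implementation: the `Hospitalization` record, its field names, and the use of the highest brain natriuretic peptide value during the stay are assumptions made for the example; only the three criteria and the 500 pg/ml cutoff come from the text above.

```python
from dataclasses import dataclass
from typing import Optional

# Cutoff stated in the abstract; everything else here is hypothetical.
BNP_THRESHOLD_PG_ML = 500.0


@dataclass
class Hospitalization:
    """Hypothetical per-hospitalization record (field names are assumptions)."""
    hf_on_problem_list: bool
    inpatient_loop_diuretic: bool
    # Assumption: highest BNP measured during the stay; None if never measured.
    max_bnp_pg_ml: Optional[float] = None


def algorithm_2_flags_hf(h: Hospitalization) -> bool:
    """Return True if at least one of the three algorithm 2 criteria is met."""
    bnp_elevated = (
        h.max_bnp_pg_ml is not None and h.max_bnp_pg_ml >= BNP_THRESHOLD_PG_ML
    )
    return h.hf_on_problem_list or h.inpatient_loop_diuretic or bnp_elevated
```

Because the criteria are combined with a logical OR, adding criteria can only increase the number of hospitalizations flagged, which is why a rule of this shape tends to raise sensitivity while lowering PPV.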
Main Outcome and Measure
Heart failure diagnosis, based on discharge diagnosis and physician review of sampled charts.
Results
Of 47,119 included hospitalizations, 6,549 (13.9%) had a discharge diagnosis of heart failure. Inclusion of heart failure on the problem list (algorithm 1) had a sensitivity of 0.40 and positive predictive value (PPV) of 0.96 for heart failure identification. Algorithm 2 improved sensitivity to 0.77 at the expense of a lower PPV (0.64). Algorithms 3, 4, and 5 had areas under the receiver operating characteristic curves (AUCs) of 0.953, 0.969, and 0.974, respectively. At a PPV of 0.9, these algorithms had associated sensitivities of 0.68, 0.77, and 0.83, respectively.
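Reporting sensitivity at a fixed PPV amounts to sweeping the decision threshold over a model's predicted scores and keeping the best sensitivity among thresholds that meet the PPV target. The sketch below shows one way to compute this; the function name and the toy scores and labels are hypothetical and are not data from the study.

```python
from typing import Sequence


def sensitivity_at_ppv(
    scores: Sequence[float], labels: Sequence[int], target_ppv: float = 0.9
) -> float:
    """Highest sensitivity achievable at any score threshold with PPV >= target_ppv.

    labels are 1 for heart failure and 0 otherwise. Returns 0.0 if no
    threshold reaches the target PPV.
    """
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")

    # Rank cases from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    best_sensitivity = 0.0
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Cases with tied scores are flagged together at this threshold.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        ppv = true_pos / (true_pos + false_pos)
        if ppv >= target_ppv:
            best_sensitivity = max(best_sensitivity, true_pos / total_positives)
    return best_sensitivity


# Toy, made-up example: five cases, three of which are true heart failure.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
toy_labels = [1, 1, 0, 1, 0]
toy_result = sensitivity_at_ppv(toy_scores, toy_labels, target_ppv=0.9)
```

In the toy example, flagging only the two highest-scoring cases keeps PPV at 1.0 and captures two of the three true cases; lowering the threshold further admits a false positive and drops PPV below 0.9.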
Conclusion and Relevance
The problem list is insufficient for real-time identification of hospitalized patients with heart failure. The high predictive accuracy of machine learning using free text demonstrates that support of such analytics in future EHR systems can improve cohort identification.