Background: Transfusion‐related adverse events can be unrecognized and unreported. As part of the US Food and Drug Administration's Center for Biologics Evaluation and Research Biologics Effectiveness and Safety initiative, we explored whether machine learning methods, such as natural language processing (NLP), can identify and report transfusion allergic reactions (ARs) from electronic health records (EHRs). Study Design and Methods: Over a 4‐year period, all 146 reported transfusion ARs were pulled from a database of 86,764 transfusions in an academic health system, along with a random sample of 605 transfusions without reported ARs. Structured and unstructured EHR data were retrieved, including demographics, new symptoms, medications, and lab results. In the unstructured data, evidence from clinicians' notes, test results, and prescription fields identified transfusion ARs, and this evidence was used to extract NLP features. Clinician reviews of selected validation cases assessed and confirmed model performance. Results: Clinician reviews of selected validation cases yielded a sensitivity of 67.9% and a specificity of 97.5% at a threshold of 0.9, with a positive predictive value (PPV) of 84%, which falls to an estimated 4.5% when extrapolated to match the transfusion AR incidence in the full transfusion dataset. A higher threshold achieved a sensitivity of 43% with a specificity and PPV of 100% in our validation set. Essential features predicting ARs were recognized transfusion reactions, administration of antihistamines or glucocorticoids, and skin symptoms (e.g., hives and itching). Removal of NLP features decreased model performance. Discussion: NLP algorithms can identify transfusion reactions from the EHR with a reasonable level of precision for subsequent clinician review and confirmation.
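The gap between the 84% PPV in the validation set and the roughly 4.5% extrapolated PPV follows from how rare reported ARs are in the full dataset. The sketch below applies Bayes' rule to the figures quoted in the abstract. It is an illustration only, not the authors' calculation: the function name `ppv` and the assumption that the 146 reported ARs approximate the true incidence are ours, and because the published sensitivity and specificity are rounded, the result lands near, not exactly on, the reported value.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value from Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Figures quoted in the abstract.
reported_ars = 146
total_transfusions = 86_764
sensitivity = 0.679   # at threshold 0.9
specificity = 0.975   # at threshold 0.9

# Assumption (ours): reported ARs approximate the true AR incidence.
incidence = reported_ars / total_transfusions
extrapolated_ppv = ppv(sensitivity, specificity, incidence)

print(f"AR incidence in full dataset: {incidence:.3%}")
print(f"Extrapolated PPV: {extrapolated_ppv:.1%}")
```

With an incidence below 0.2%, even a 2.5% false-positive rate produces far more false alerts than true ones, which is why the abstract frames the output as a candidate list for clinician review.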
Introduction: The Food and Drug Administration Center for Biologics Evaluation and Research conducts post-market surveillance of biologic products to ensure their safety and effectiveness. Studies have found that common vaccine exposures may be missing from structured data elements of electronic health records (EHRs), instead being captured in clinical notes. This impacts monitoring of adverse events following immunizations (AEFIs). For example, COVID-19 vaccines have been regularly administered outside of traditional medical settings. We developed a natural language processing (NLP) algorithm to mine unstructured clinical notes for vaccinations not captured in structured EHR data. Methods: A random sample of 1,000 influenza vaccine administrations, representing 995 unique patients, was extracted from a large U.S. EHR database. NLP techniques were used to detect administrations from the clinical notes in the training dataset [80% (N = 797) of patients]. The algorithm was applied to the validation dataset [20% (N = 198) of patients] to assess performance. Full medical charts for 28 randomly selected administration events in the validation dataset were reviewed by clinicians. The NLP algorithm was then applied across the entire dataset (N = 995) to quantify the number of additional events identified. Results: A total of 3,199 administrations were identified in the structured data and clinical notes combined. Of these, 2,740 (85.7%) were identified in the structured data, while the NLP algorithm identified 1,183 (37.0%) administrations in clinical notes; 459 were not also captured in the structured data. This represents a 16.8% increase in the identification of vaccine administrations compared to using structured data alone. The validation of 28 vaccine administrations confirmed 27 (96.4%) as “definite” vaccine administrations; 18 (64.3%) had evidence of a vaccination event in the structured data, while 10 (35.7%) were found solely in the unstructured notes. Discussion: We demonstrated the utility of an NLP algorithm to identify vaccine administrations not captured in structured EHR data. NLP techniques have the potential to improve detection of vaccine administrations not otherwise reported without increasing the analysis burden on physicians or practitioners. Future applications could include refining estimates of vaccine coverage and detecting other exposures, population characteristics, and outcomes not reliably captured in structured EHR data.
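The counts in the Results fit together as two overlapping sets (structured data and clinical notes). As a quick consistency check, the sketch below recomputes the percentages from the three totals quoted in the abstract; the variable names are ours, and the overlap figure is derived here, not stated in the source.

```python
# Totals quoted in the abstract.
total_administrations = 3_199   # structured data and notes combined
in_structured = 2_740           # found in structured data
in_notes = 1_183                # found by the NLP algorithm in notes

# Administrations seen only in notes (the abstract reports 459).
notes_only = total_administrations - in_structured

# Derived here, not stated in the source: found in both sources.
in_both = in_notes - notes_only

# Relative gain over structured data alone.
relative_increase = notes_only / in_structured

print(f"notes only: {notes_only}")
print(f"in both sources: {in_both}")
print(f"structured share: {in_structured / total_administrations:.1%}")
print(f"notes share: {in_notes / total_administrations:.1%}")
print(f"increase over structured alone: {relative_increase:.1%}")
```

The 16.8% figure is therefore relative to the 2,740 structured-data administrations, not to the combined total of 3,199.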