IMPORTANCE Understanding the differences and potential synergies between traditional clinician assessment and automated machine learning might enable more accurate and useful suicide risk detection. OBJECTIVE To evaluate the respective and combined abilities of a real-time machine learning model and the Columbia Suicide Severity Rating Scale (C-SSRS) to predict suicide attempt (SA) and suicidal ideation (SI). DESIGN, SETTING, AND PARTICIPANTS This cohort study included encounters with adult patients (aged ≥18 years) at a major academic medical center. The C-SSRS was administered during routine care, and a Vanderbilt Suicide Attempt and Ideation Likelihood (VSAIL) prediction was generated in the electronic health record. Encounters took place in the inpatient, ambulatory surgical, and emergency department settings. Data were collected from June 2019 to September 2020. MAIN OUTCOMES AND MEASURES Primary outcomes were the incidence of SA and SI, encoded as International Classification of Diseases codes, occurring within various time periods after an index visit. We evaluated the retrospective validity of the C-SSRS, VSAIL, and ensemble models combining both. Discrimination metrics included area under the receiver operating curve (AUROC), area under the precision-recall curve (AUPR), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). RESULTS The cohort included 120 398 unique index visits for 83 394 patients (mean [SD] age, 51.2 [20.6] years; 38 107 [46%] men; 45 273 [54%] women; 13 644 [16%] Black; 63 869 [77%] White). Within 30 days of an index visit, the combined models had higher AUROC (SA: 0.874-0.887; SI: 0.869-0.879) than both the VSAIL (SA: 0.729; SI: 0.773) and C-SSRS (SA: 0.823; SI: 0.777) models.
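As a minimal sketch of how a clinician-administered scale and an automated risk score might be combined and evaluated, the Python example below fits a logistic regression on two hypothetical inputs (a C-SSRS-style severity score and a VSAIL-style probability) and reports AUROC and AUPR on a held-out split. The column names, synthetic data, and choice of logistic regression are illustrative assumptions, not the study's actual ensembling pipeline.

```python
# Minimal sketch: ensembling a clinician scale (C-SSRS-style) with a model score
# (VSAIL-style) via logistic regression, then computing AUROC and AUPR.
# Column names and the synthetic data are illustrative assumptions only.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "cssrs_score": rng.integers(0, 6, n),   # hypothetical 0-5 screen severity
    "vsail_prob": rng.uniform(0, 1, n),     # hypothetical model probability
})
# Synthetic outcome loosely correlated with both inputs (illustration only)
logit = -4 + 0.5 * df["cssrs_score"] + 2.0 * df["vsail_prob"]
df["suicide_attempt_30d"] = rng.binomial(1, 1 / (1 + np.exp(-logit)))

X_train, X_test, y_train, y_test = train_test_split(
    df[["cssrs_score", "vsail_prob"]], df["suicide_attempt_30d"],
    test_size=0.2, stratify=df["suicide_attempt_30d"], random_state=0)

ensemble = LogisticRegression().fit(X_train, y_train)
p = ensemble.predict_proba(X_test)[:, 1]
print(f"AUROC: {roc_auc_score(y_test, p):.3f}")
print(f"AUPR:  {average_precision_score(y_test, p):.3f}")
```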
Methods relying on diagnostic codes to identify suicidal ideation and suicide attempt in Electronic Health Records (EHRs) at scale are suboptimal because suicide-related outcomes are heavily under-coded. We propose to improve the ascertainment of suicidal outcomes using natural language processing (NLP). We developed information retrieval methodologies to search over 200 million notes from the Vanderbilt EHR. Suicide query terms were extracted using word2vec. A weakly supervised approach was designed to label cases of suicidal outcomes. The NLP validation of the top 200 retrieved patients showed high performance for suicidal ideation (area under the receiver operator curve [AUROC]: 98.6, 95% confidence interval [CI] 97.1–99.5) and suicide attempt (AUROC: 97.3, 95% CI 95.2–98.7). Case extraction produced the best performance when combining NLP and diagnostic codes and when accounting for negated suicide expressions in notes. Overall, we demonstrated that scalable and accurate NLP methods can be developed to identify suicidal behavior in EHRs to enhance prevention efforts, predictive models, and precision medicine.
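A minimal sketch of the word2vec query-expansion step described above follows, assuming a gensim Word2Vec model trained on tokenized note sentences; the toy corpus and seed terms are placeholders rather than the actual Vanderbilt retrieval system, which searched more than 200 million notes.

```python
# Minimal sketch: expanding suicide-related query terms with word2vec nearest
# neighbors, as one step of an NLP retrieval pipeline. The toy corpus and seed
# terms are illustrative assumptions only.
from gensim.models import Word2Vec

# Toy corpus of tokenized note sentences (placeholder for real EHR notes)
sentences = [
    ["patient", "denies", "suicidal", "ideation"],
    ["reports", "suicidal", "thoughts", "with", "plan"],
    ["history", "of", "suicide", "attempt", "by", "overdose"],
    ["no", "self", "harm", "or", "suicidal", "ideation"],
] * 100  # repeat so the tiny vocabulary gets enough training examples

model = Word2Vec(sentences, vector_size=50, window=3, min_count=1,
                 epochs=20, seed=0)

seed_terms = ["suicidal", "suicide"]
expanded = set(seed_terms)
for term in seed_terms:
    for neighbor, _score in model.wv.most_similar(term, topn=5):
        expanded.add(neighbor)
print(sorted(expanded))  # candidate query terms for chart retrieval
```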
Treatment-resistant depression (TRD), often defined by absence of symptomatic remission following at least two adequate treatment trials, occurs in roughly a third of all individuals with major depressive disorder (MDD). Prior work has suggested a significant common variant genetic component of liability to TRD, with heritability estimates of 8% when compared with non-treatment-resistant MDD. Despite this evidence of heritability, no replicated genetic loci have been identified and the genetic architecture of TRD remains unclear. A key barrier to this work has been the paucity of adequately powered cohorts for investigation, largely because of the challenge in prospectively investigating this phenotype. Using electroconvulsive therapy (ECT) as a surrogate for TRD, we applied standard machine learning methods to electronic health record (EHR) data to derive predicted probabilities of receiving ECT. We applied these probabilities as a quantitative trait in a genome-wide association study (GWAS) of 154,433 genotyped patients across four large biobanks. With this approach, we demonstrate heritability ranging from 2% to 4.2% and significant genetic overlap with cognition, attention deficit hyperactivity disorder, schizophrenia, alcohol and smoking traits, and body mass index. We identify two genome-wide significant loci, both previously implicated in metabolic traits, suggesting shared biology and potential pharmacological implications. This work provides support for the utility of estimation of disease probability for genomic investigation and provides insights into the genetic architecture and biology of TRD.
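To illustrate the idea of using a predicted probability as a quantitative trait, the sketch below derives out-of-fold probabilities of receiving ECT from synthetic EHR-style features and writes them out as a phenotype file; the features, identifiers, model choice, and file format are assumptions for illustration, not the study's pipeline.

```python
# Minimal sketch: deriving a predicted probability of ECT from EHR-style features
# and exporting it as a quantitative phenotype (PLINK-style .pheno layout).
# Features, IDs, and the model choice are illustrative assumptions.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 2000
ehr = pd.DataFrame({
    "n_depression_codes": rng.poisson(3, n),
    "n_antidepressant_trials": rng.integers(0, 6, n),
    "n_psych_admissions": rng.poisson(1, n),
})
received_ect = rng.binomial(1, 0.05, n)  # synthetic surrogate label

# Out-of-fold probabilities avoid using overfit in-sample scores as the trait
model = GradientBoostingClassifier(random_state=0)
prob_ect = cross_val_predict(model, ehr, received_ect, cv=5,
                             method="predict_proba")[:, 1]

pheno = pd.DataFrame({"FID": np.arange(n), "IID": np.arange(n),
                      "prob_ect": prob_ect})
pheno.to_csv("ect_probability.pheno", sep="\t", index=False)
```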
Objective To develop and validate algorithms for predicting 30-day fatal and nonfatal opioid-related overdose using statewide data sources including prescription drug monitoring program data, Hospital Discharge Data System data, and Tennessee (TN) vital records. Current overdose prevention efforts in TN rely on descriptive and retrospective analyses without prognostication. Materials and Methods Study data included 3 041 668 TN patients with 71 479 191 controlled substance prescriptions from 2012 to 2017. Statewide data and socioeconomic indicators were used to train, ensemble, and calibrate 10 nonparametric “weak learner” models. Validation was performed using area under the receiver operating curve (AUROC), area under the precision recall curve, risk concentration, and Spiegelhalter z-test statistic. Results Within 30 days, 2574 fatal overdoses occurred after 4912 prescriptions (0.0069%) and 8455 nonfatal overdoses occurred after 19 460 prescriptions (0.027%). Discrimination and calibration improved after ensembling (AUROC: 0.79–0.83; Spiegelhalter P value: 0–.12). Risk concentration captured 47–52% of cases in the top quantiles of predicted probabilities. Discussion Partitioning and ensembling enabled all study data to be used given computational limits and helped mediate case imbalance. Predicting risk at the prescription level can aggregate risk to the patient, provider, pharmacy, county, and regional levels. Implementing these models into Tennessee Department of Health systems might enable more granular risk quantification. Prospective validation with more recent data is needed. Conclusion Predicting opioid-related overdose risk at statewide scales remains difficult and models like these, which required a partnership between an academic institution and state health agency to develop, may complement traditional epidemiological methods of risk identification and inform public health decisions.
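Because the Spiegelhalter z-test is a less familiar calibration metric than AUROC, a small sketch of the statistic follows, using the standard formulation rather than the study's exact code: under the null hypothesis of perfect calibration, z is approximately standard normal.

```python
# Minimal sketch of the Spiegelhalter z-test for calibration. Standard
# formulation; synthetic low-prevalence data mimic rare overdose outcomes.
import numpy as np
from scipy.stats import norm

def spiegelhalter_z(y_true, p_pred):
    """y_true: 0/1 outcomes; p_pred: predicted probabilities."""
    y = np.asarray(y_true, dtype=float)
    p = np.asarray(p_pred, dtype=float)
    numerator = np.sum((y - p) * (1 - 2 * p))
    denominator = np.sqrt(np.sum(((1 - 2 * p) ** 2) * p * (1 - p)))
    return numerator / denominator

rng = np.random.default_rng(0)
p = rng.uniform(0.001, 0.05, 10_000)   # low event rates, as in overdose risk
y = rng.binomial(1, p)                 # outcomes drawn from stated probabilities
z = spiegelhalter_z(y, p)
print(f"z = {z:.2f}, two-sided p = {2 * (1 - norm.cdf(abs(z))):.3f}")
```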
UNSTRUCTURED By learning complex statistical relationships from historical data, predictive models enable automated, scalable risk detection and prognostication that can inform clinical decision making. Although relatively few have been implemented into clinical use compared with the number developed, predictive models are increasingly being deployed and tested in clinical trials. The stakes in predictive modeling continue to rise, including growing regulation by groups like the U.S. Food and Drug Administration. Efforts to standardize steps in model development and validation include statements like TRIPOD and multiple published guidelines on deployment and governance. Yet the most common approach to a critical step in model development, the validation strategy, remains a simple "hold-out" or "train-test split", which has been shown to introduce bias, fail to generalize, and hinder clinical utility. Broadly, validation consists of either internal validation, which should be reported alongside model development, or external validation, in which a developed model is tested on an unseen dataset from a new setting. A newer concept of "internal-external" validation has also been suggested for studies with multi-site data. Most published models evaluate performance metrics by splitting the available dataset into an independent "hold-out" or "test" set consisting of unseen samples excluded from model training. Such held-out sets are often selected randomly, e.g., "80% training and 20% testing", from data in the original model development setting. In contrast to hold-out validation, cross-validation and resampling methods like bootstrapping can produce less biased estimates of the true out-of-sample performance (i.e., the ability to generalize to new samples). Although cross-validation is a widely used and extensively studied statistical method, many variations of cross-validation exist, each with its own strengths and weaknesses, distinct use cases for model development and performance estimation that are often misapplied, and domain-specific considerations necessary for effective healthcare implementation. This tutorial defines and compares approaches to cross-validation using representative, accessible data drawn from the well-known and well-studied MIMIC-III dataset. All cross-validation modeling experiments and preprocessing code will be provided through reproducible notebooks that further guide readers through the comparisons and concepts introduced. Best practices and common missteps, particularly in modeling with electronic health care data, will be emphasized.
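To make the contrast between a single hold-out split and cross-validation concrete, the sketch below compares an 80/20 split with repeated stratified k-fold estimates on synthetic data; it is a generic illustration of the concept, not the tutorial's MIMIC-III notebooks.

```python
# Minimal sketch: a single 80/20 hold-out estimate vs. repeated stratified
# k-fold cross-validation on synthetic, class-imbalanced data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (RepeatedStratifiedKFold, cross_val_score,
                                     train_test_split)
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=500, n_features=20, weights=[0.9, 0.1],
                           random_state=0)

# Single hold-out split: one point estimate, sensitive to how the split falls
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, stratify=y,
                                          random_state=0)
holdout_auc = roc_auc_score(
    y_te, LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict_proba(X_te)[:, 1])

# Repeated stratified 5-fold CV: a distribution of estimates, less biased
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
cv_aucs = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                          scoring="roc_auc", cv=cv)
print(f"Hold-out AUROC: {holdout_auc:.3f}")
print(f"CV AUROC: mean {cv_aucs.mean():.3f}, SD {cv_aucs.std():.3f}")
```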