Objective. Hypoglycemia occurs in 20% to 60% of patients with diabetes mellitus. Identifying at-risk patients can facilitate interventions to lower risk. We sought to develop a hypoglycemia prediction model.

Methods. In this retrospective cohort study, urban adults prescribed a diabetes drug between 2004 and 2013 were identified. Demographic and clinical data were extracted from an electronic medical record (EMR). Hypoglycemia was identified from laboratory tests, diagnostic codes, and natural language processing (NLP). We compared multiple logistic regression, classification and regression trees (CART), and Random Forest. Models were evaluated on an independent test set or through cross-validation.

Results. The 38,780 patients had a mean age of 57 years; 56% were female, 40% African-American, and 39% uninsured. Hypoglycemia occurred in 8,128 patients (539 identified only by NLP). In logistic regression, factors positively associated with hypoglycemia included infection, non-long-acting insulin, dementia, and recent hypoglycemia. Negatively associated factors included long-acting insulin plus sulfonylurea, and age 75 or older. The models' area under the curve (AUC) was similar (logistic regression, 89%; CART, 88%; Random Forest, 90%, with 10-fold cross-validation).

Conclusions. NLP improved identification of hypoglycemia. Non-long-acting insulin was an important risk factor. Decreased risk with age may reflect treatment or diminished awareness of hypoglycemia. More complex models did not improve prediction.
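For readers unfamiliar with the metric, the sketch below illustrates what an area under the curve of roughly 90% means and how data might be split for 10-fold cross-validation. It is not the authors' code: the abstract does not say how AUC was computed or how folds were assigned, so the function names `auc` and `k_fold_indices`, the pairwise (Mann-Whitney) formulation, and the round-robin fold assignment are all assumptions made for illustration.

```python
from typing import List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case
    (label 1) receives a higher score than a randomly chosen negative
    case (label 0); ties count as half. Hypothetical helper, O(n^2)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n: int, k: int = 10) -> List[List[int]]:
    """Assign n row indices to k held-out folds in round-robin order.
    Assumed scheme; real studies usually shuffle or stratify first."""
    return [list(range(i, n, k)) for i in range(k)]


if __name__ == "__main__":
    # Toy data, not from the study: 1 = hypoglycemia, 0 = none.
    y = [0, 0, 1, 1]
    risk = [0.1, 0.4, 0.35, 0.8]
    print(auc(y, risk))
    print(k_fold_indices(25, 10)[0])
```

Under this reading, an AUC of 90% means that a patient who developed hypoglycemia would be ranked above a patient who did not in about nine of ten random pairs. In cross-validation, each fold is held out once while the model is fit on the remaining folds.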