Artificial intelligence (AI) is expected to support clinical judgement in medicine. We constructed a new predictive model for diabetic kidney diseases (DKD) using AI, processing natural language and longitudinal data with big data machine learning, based on the electronic medical records (EMR) of 64,059 diabetes patients. AI extracted raw features from the previous 6 months as the reference period and selected 24 factors to find time series patterns relating to 6-month DKD aggravation, using a convolutional autoencoder. AI constructed the predictive model with 3,073 features, including time series data using logistic regression analysis. AI could predict DKD aggravation with 71% accuracy. Furthermore, the group with DKD aggravation had a significantly higher incidence of hemodialysis than the non-aggravation group, over 10 years (N = 2,900). The new predictive model by AI could detect progression of DKD and may contribute to more effective and accurate intervention to reduce hemodialysis.
Artificial intelligence is increasingly being adopted in medical fields to predict various outcomes. In particular, chronic kidney disease (CKD) is problematic because it often progresses to end-stage kidney disease. However, the trajectories of kidney function depend on individual patients. In this study, we propose a machine learning-based model to predict the rapid decline in kidney function among CKD patients, using a large hospital database constructed from the information of 118,584 patients derived from the electronic medical records system. The database included the estimated glomerular filtration rate (eGFR) of each patient, recorded at least twice over a period of 90 days. Of these, 19,894 patients (16.8%) satisfied the CKD criteria. We characterized the rapid decline of kidney function by a decline of 30% or more in the eGFR within a period of two years and classified the available patients into two groups: those exhibiting rapid eGFR decline and those exhibiting non-rapid eGFR decline. Following this, we constructed predictive models based on two machine learning algorithms. Longitudinal laboratory data including urine protein, blood pressure, and hemoglobin were used as covariates. We computed longitudinal statistics over 90-, 180-, and 360-day windows prior to the baseline point. The longitudinal statistics included the exponentially smoothed average (ESA), where the weight was defined to be 0.9^(t/b), where t denotes the number of days prior to the baseline point and b denotes the decay parameter. In this study, b was taken to be 7 (7-day ESA). We used logistic regression (LR) and random forest (RF) algorithms, implemented in Python with the scikit-learn library (https://scikit-learn.org/), for model creation. The areas under the curve for LR and RF were 0.71 and 0.73, respectively. The 7-day ESA of urine protein ranked within the first two places in terms of importance according to both models. Further, other features related to urine protein were likely to rank higher than the rest. The LR and RF models revealed that the degree of urine protein, especially if it exhibited an increasing tendency, served as a prominent risk factor associated with rapid eGFR decline.
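The exponentially smoothed average described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the weight takes the exponential form 0.9^(t/b), so that older measurements count less, and it assumes the ESA is normalized by the sum of the weights. The function name `esa`, the `window` argument, and the sample readings are hypothetical.

```python
def esa(values, days_before_baseline, b=7.0, base=0.9, window=None):
    """Weighted average of `values` with weight base ** (t / b).

    t is the number of days before the baseline point, so recent
    measurements (small t) receive weights close to 1.
    Normalizing by the weight sum is an assumption of this sketch.
    """
    pairs = [
        (v, t)
        for v, t in zip(values, days_before_baseline)
        if window is None or 0 <= t <= window
    ]
    if not pairs:
        return None  # no measurement falls inside the window
    weights = [base ** (t / b) for _, t in pairs]
    weighted_sum = sum(w * v for w, (v, _) in zip(weights, pairs))
    return weighted_sum / sum(weights)


# Hypothetical urine protein readings (arbitrary units) and the number
# of days before the baseline point at which each was taken.
urine_protein = [0.3, 0.5, 0.9, 1.4]
days_before = [300, 150, 60, 5]

# One 7-day ESA feature per look-back window named in the abstract.
features = {
    f"esa_{w}d": esa(urine_protein, days_before, window=w)
    for w in (90, 180, 360)
}
for name, value in features.items():
    print(name, round(value, 3))
```

Features of this kind could then be passed as covariates to scikit-learn estimators such as `LogisticRegression` and `RandomForestClassifier`. With a rising series like the one above, the shorter window yields the larger ESA, which is one way an increasing tendency can surface in the features.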
Background: Diabetic kidney diseases (DKD), including diabetic nephropathy, are the most frequent cause of hemodialysis (HD), and a more precise prediction model could be useful for early intervention in DKD. Methods: We constructed a new prediction model for DKD by using artificial intelligence (AI) based on electronic medical records (EMRs). From the EMRs of 64,059 diabetes patients who visited our hospital, we extracted a variety of features. The model uses the stage of nephropathy as labels and predicts whether stage 1 patients will progress to a higher stage after 180 days. Results: The AI constructed a new prediction model by big data machine learning. First, the AI extracted raw features from the past 6 months as the reference period and selected 22 factors. Then, time series data analysis using a convolutional autoencoder was conducted to find time series patterns relating to 6-month DKD aggravation. The AI then constructed the prediction model with 17 raw features, plus time series and text as secondary features, using logistic regression. Finally, the AI predicted DKD aggravation with a maximum AUC of 0.74. Furthermore, the DKD aggravation group had a significantly higher incidence of HD than the non-aggravation group over 10 years. Conclusion: The new prediction model by AI could detect progression of DKD, which could contribute to more effective and accurate intervention to reduce HD. Disclosure M. Makino: Research Support; Spouse/Partner; THE DAI-ICHI LIFE INSURANCE COMPANY, LIMITED. Self. M. Ono: None. T. Itoko: Employee; Self; IBM. T. Katsuki: Employee; Self; IBM. A. Koseki: Employee; Self; IBM. M. Kudo: Employee; Self; IBM. K. Haida: Employee; Self; Daiichilife Insurance Company. J. Kuroda: Employee; Self; The Dai-ichi Life Insurance Company, Limited. R. Yanagiya: None. A. Suzuki: Research Support; Self; THE DAI-ICHI LIFE INSURANCE COMPANY, LIMITED. Speaker's Bureau; Self; Astellas Pharma Inc., Mitsubishi Tanabe Pharma Corporation. 
Research Support; Self; EA Pharma Co Ltd, Daiichi Sankyo Company, Limited, Chugai Pharmaceutical Co., Ltd., Kyowa Hakko Kirin Co., Ltd., MSD K.K., Novo Nordisk Inc., Ono Pharmaceutical Co., Ltd., Pfizer Inc., Taisho Pharmaceutical Co., Ltd., Takeda Pharmaceutical Company Limited.
We address the problem of predicting when a disease will develop, i.e., medical event time (MET), from a patient's electronic health record (EHR). The MET of non-communicable diseases like diabetes is highly correlated to cumulative health conditions, more specifically, how much time the patient spent with specific health conditions in the past. The common time-series representation is indirect in extracting such information from EHR because it focuses on detailed dependencies between values in successive observations, not cumulative information. We propose a novel data representation for EHR called cumulative stay-time representation (CTR), which directly models such cumulative health conditions. We derive a trainable construction of CTR based on neural networks that has the flexibility to fit the target data and scalability to handle high-dimensional EHR. Numerical experiments using synthetic and real-world datasets demonstrate that CTR alone achieves a high prediction performance, and it enhances the performance of existing models when combined with them.
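The idea of a cumulative stay time can be illustrated with a simplified, discretized sketch. It is not the trainable neural-network construction the abstract describes. It assumes each observed value holds until the next observation, and it uses hand-picked bin edges to define "health conditions"; the function name, the bin edges, and the HbA1c readings are all hypothetical.

```python
from bisect import bisect_right


def cumulative_stay_time(times, values, bin_edges):
    """Return the total time spent in each value bin.

    Each observation is assumed to hold until the next one, so the
    last observation contributes no duration. Bin i covers values in
    [bin_edges[i - 1], bin_edges[i]); the first and last bins are open.
    """
    stay = [0.0] * (len(bin_edges) + 1)
    for i in range(len(times) - 1):
        duration = times[i + 1] - times[i]
        state = bisect_right(bin_edges, values[i])  # index of the bin
        stay[state] += duration
    return stay


# Hypothetical HbA1c readings (%) at days since the first visit.
days = [0, 90, 200, 365, 500]
hba1c = [5.8, 6.7, 7.2, 6.4, 6.9]
edges = [6.0, 6.5, 7.0]  # bins: <6.0, [6.0, 6.5), [6.5, 7.0), >=7.0

ctr = cumulative_stay_time(days, hba1c, edges)
print(ctr)  # → [90.0, 135.0, 110.0, 165.0]
```

The resulting vector records how long the patient stayed in each condition regardless of the order of observations, which is the cumulative information the abstract contrasts with the usual time-series representation.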