BACKGROUND AND AIMS Around the globe, over 850 million patients suffer from chronic kidney disease (CKD). CKD is associated with high mortality rates, in particular among patients undergoing renal replacement therapy (RRT) such as dialysis, where mortality reaches up to 10% a year; these patients are therefore considered fragile. CKD is also associated with cardiovascular complications, and the two conditions can aggravate each other. Available clinical guidelines identify certain risk factors and predictive models, but these have not been successfully tested and validated in renal patients; consequently, there is an unmet need for the identification of predictive factors and the prediction of mortality. This gap stems from the limitations of current methodologies and statistics: current models simplify complex relationships by assuming a linear relation between risk factors and certain events, so a new approach is needed. In recent years, the rise of Artificial Intelligence and Machine Learning (ML) has presented an alternative for the first time. This project aimed to study the performance of different ML algorithms for the prediction of mortality and the identification of risk factors in CKD patients.

METHOD Design: Retrospective analysis of a historical cohort from the Register of Renal Patients of Catalonia (RMRC) and the Catalan Agency for Health Quality and Evaluation. The cohort comprised 10 473 patients with CKD ranging from stage 1 to RRT. Follow-up of 11 years, from January 2010 to December 2020. Inclusion criteria: age >18 years. An Extreme Gradient Boosting model was trained and compared with other algorithms for the prediction of mortality at different times, using different follow-up periods for each patient. Methodology: Variables: i) age, gender, body mass index, time to death (9); ii) diagnoses (ICD-9/10) (26); iii) laboratory variables (37); iv) all pharmacological treatments (46). For all executions, data were balanced using the SMOTETomek technique.
RESULTS The patient sample had a mean age of 68.2 ± 12.9 years; 65.8% were female and 34.2% male. Different follow-up periods and prediction windows were tested, and the best results were obtained with a 2-year follow-up period and a 4-year mortality prediction. The Area Under the Curve values obtained for each model were: XGBClassifier (0.89), LGBMClassifier (0.90), CatBoostClassifier (0.91). The 10 variables with major relevance according to the XGBClassifier (54.65% of the total weight of the 71 variables) are, in this order: cardiopathy, advanced chronic kidney disease, vasculopathy, age, neoplasia, transplant, digestive pathology, estimated glomerular filtration rate, high blood pressure. The results presented in the Figure and Table correspond to the mean over the 5 folds of the cross-validation.

CONCLUSION Machine Learning techniques offer an alternative to classical statistical methods, with a high predictive capacity for mortality. Generating algorithms from real-world data may allow individualization of the mortality risk as well as of the predictive factors.
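The Area Under the Curve values reported above measure how well each classifier ranks patients who died above those who survived. The following is a minimal sketch of that metric and of averaging it over cross-validation folds, not the authors' code: the function names `roc_auc` and `mean_fold_auc` and the toy labels and scores are hypothetical, and the pairwise (Mann-Whitney) formulation is assumed here only because it is easy to read.

```python
from statistics import mean


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties count as half a correct ranking (Mann-Whitney U formulation).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def mean_fold_auc(folds):
    """Average the AUC over folds, each given as a (labels, scores) pair."""
    return mean(roc_auc(labels, scores) for labels, scores in folds)


# Toy example with made-up labels (1 = died) and predicted risks.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 5 of 6 pairs ranked correctly
```

The pairwise loop is quadratic in the number of patients, so for a cohort of this size a rank-based implementation from an established library would be the practical choice.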
Background and Aims Chronic Kidney Disease (CKD) is a common and debilitating condition that affects over 850 million people worldwide. The disease is associated with high mortality rates, which can reach 10-15% per year, and with multiple complications, among which cardiovascular ones stand out. These complications can contribute to the progression of CKD, and CKD progression in turn favors the appearance of further complications, so that the two feed each other. Despite the availability of clinical guidelines and predictive models, accurately predicting disease progression and identifying risk factors for progression in CKD patients remains a challenge. The limitations of current methodologies, including the simplification of complex relationships and the reliance on linear assumptions, have hindered progress in this area. The advancement of Artificial Intelligence and Machine Learning has provided a new opportunity to address these challenges. The goal of this study was to evaluate the performance of gradient boosting algorithms in predicting the progression of renal disease in a large dataset of 1,327 patients with a follow-up of 10 years.

Method Design: Retrospective analysis of a historical cohort from the Register of Renal Patients of Catalonia (RMRC) and the Data analytics program for health research and innovation (PADRIS) from the Health Quality and Assessment Agency of Catalonia (AQuAS). Inclusion criteria: age > 18 years, CKD stages from 2 to Renal Replacement Therapy (RRT), and adequate data after pre-processing of the sample. N = 1,327 patients with 27,572 records. Follow-up of 10 years (January 2010 - December 2020). Variables: a) Age, gender, BMI; b) Diagnoses (ICD-10) = 95; c) Transplant waiting list status; d) RRT status; e) Laboratory variables = 77; f) Pharmacological treatment = 100. Method: A Light Gradient-Boosting Machine (LGBM) was used to test the prediction horizon for CKD progression in quarterly windows over multiple periods. Methodology:
1. Pre-processing of the sample and data.
2. Training and testing for variable exploration.
3. Dataset structuring in quarterly windows.
4. Sample randomization and data separation for a 5-fold cross-validation (20% test - 80% training).
5. Training and tuning of the LGBM model for different prediction horizons.

Results Age: 62 ± 13 years; Gender: 34% female, 66% male. The best prediction horizon was 8 quarters (2 years), with an area under the ROC curve of 0.967 and an accuracy of 0.860. The 10 variables with major relevance in the model were, in order, estimated Glomerular Filtration Rate, Age, Microalbuminuria, BMI, HDL, Glucose, Urea, Platelets, Triglycerides and Sodium.

Conclusion 1. The prediction of CKD progression can benefit from the use of Machine Learning, with results that outperform methods based on classical statistics. 2. It can allow individualization of the prognosis, enabling early interventions to improve it.
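The methodology steps on quarterly windows, prediction horizons and 5-fold cross-validation can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's pipeline: the helper names `quarter_index`, `label_progression` and `five_fold_splits` are hypothetical, the abstract does not say how progression was defined or whether the split was made by patient or by record (splitting by patient is assumed here), and the January 2010 start date is taken from the follow-up period.

```python
import random
from datetime import date


def quarter_index(day, start=date(2010, 1, 1)):
    """Number of calendar quarters between `start` and `day` (0 for the first)."""
    return (day.year - start.year) * 4 + (day.month - 1) // 3 - (start.month - 1) // 3


def label_progression(progression_quarter, window_quarter, horizon=8):
    """1 if progression falls within `horizon` quarters after the window, else 0.

    `progression_quarter` is None for patients with no recorded progression.
    """
    if progression_quarter is None:
        return 0
    return int(0 < progression_quarter - window_quarter <= horizon)


def five_fold_splits(patient_ids, n_folds=5, seed=0):
    """Shuffle patients and yield (train, test) lists; each test fold holds ~20%."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        train = [pid for j, fold in enumerate(folds) if j != k for pid in fold]
        yield train, folds[k]


# Toy usage with ten made-up patient identifiers.
splits = list(five_fold_splits(range(10)))
```

Splitting by patient rather than by record keeps all quarterly windows of one person on the same side of the split, which avoids leaking a patient's history into the test fold.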