Background There is currently a lack of models for predicting the occurrence of venous thromboembolism (VTE) in patients with lung cancer. Machine learning (ML) techniques are increasingly being adopted in the medical field because of their capacity for intelligent analysis and their scalability. This study aimed to develop and validate ML models to predict the incidence of VTE among lung cancer patients. Methods Data from lung cancer patients with and without VTE at a Grade 3A cancer hospital in China were included. Patient characteristics and clinical predictors related to VTE were collected. The primary endpoint was the diagnosis of VTE during the index hospitalization. We calculated and compared the area under the receiver operating characteristic curve (AUROC) across multiple models and selected the best-performing one (the random forest model). We also investigated feature contributions during training using both permutation importance scores and the impurity-based feature importance scores of the random forest model. Results In total, 3,398 patients were included in our study, 125 of whom experienced VTE during their hospital stay. The ROC curve and precision–recall curve (PRC) for the random forest model showed an AUROC of 0.91 (95% CI: 0.893–0.926) and an area under the PRC (AUPRC) of 0.43 (95% CI: 0.363–0.500). For the simplified model, the five most relevant features were selected: Karnofsky Performance Status (KPS), a history of VTE, recombinant human endostatin, EGFR-TKI, and platelet count. A random forest classifier re-trained on these five features achieved an AUROC of 0.87 (95% CI: 0.802–0.917) and an AUPRC of 0.30 (95% CI: 0.265–0.358). Conclusion Because the model's performance did not decrease conspicuously when fewer features were used for prediction, we concluded that our simplified model would be more applicable in real-life clinical settings. 
The ML model developed in our study has the potential to improve the early detection and prediction of VTE in patients with lung cancer.
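The two quantities the abstract relies on, AUROC and permutation importance, can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the scoring function `predict` is a hypothetical stand-in for a trained random forest, and the helper names `auroc` and `permutation_importance` are assumptions made for illustration. It uses only the Python standard library.

```python
import random


def auroc(y_true, scores):
    """Rank-based AUROC: the probability that a randomly chosen positive
    case scores higher than a randomly chosen negative case (ties = 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in AUROC when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = auroc(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = auroc(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Synthetic data (assumption): feature 0 drives the outcome, feature 1 is noise.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [1 if row[0] + 0.2 * data_rng.gauss(0, 1) > 0.8 else 0 for row in X]


def predict(row):
    """Hypothetical risk score standing in for a fitted model."""
    return row[0]


baseline, importances = permutation_importance(predict, X, y)
print(f"baseline AUROC: {baseline:.3f}")
print("permutation importances:", [round(v, 3) for v in importances])
```

Because the stand-in score ignores feature 1, shuffling that column leaves the AUROC unchanged and its importance is exactly zero, while shuffling feature 0 produces a clear drop. With a real classifier the same procedure would normally be run on held-out data.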
Background and aims Currently, there is still no definitive consensus on the treatment of intrahepatic cholangiocarcinoma (iCCA). This study aimed to build a clinical decision support tool based on machine learning using the Surveillance, Epidemiology, and End Results (SEER) database and data from the Fifth Medical Center of the PLA General Hospital in China. Methods A total of 4,398 eligible patients from the SEER database and 504 eligible patients from the hospital data, all with histologically proven iCCA, were enrolled for modeling by cross-validation based on machine learning. All the models were trained using the open-source Python library scikit-survival version 0.16.0. The SHapley Additive exPlanations (SHAP) method was used to help clinicians better understand the obtained results. Permutation importance was calculated using the ELI5 library. Results All involved treatment modalities could contribute to a better prognosis. Three models were derived and tested using different data sources, with concordance indices of 0.67, 0.69, and 0.73, respectively. The predictions were consistent with the actual outcomes of randomly selected patients. Model 2, trained using the hospital data, was selected to develop an online tool because of its advantage in predicting short-term prognosis. Conclusion The prediction model and tool established in this study can be applied to predict the prognosis of iCCA after treatment by inputting the patient’s clinical parameters or TNM stages and treatment options, thus contributing to optimal clinical decisions. KEY MESSAGES A prognostic model related to disease staging and treatment mode was constructed using machine learning, based on multicenter big data. The online calculator can predict the short-term survival prognosis of intrahepatic cholangiocarcinoma, thus helping to make the best clinical decision. 
The online calculator built to calculate the mortality risk and overall survival can be easily obtained and applied.
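The concordance indices reported above measure how often a model ranks patients correctly by risk. As a minimal sketch of Harrell's concordance index for right-censored survival data: the function name `concordance_index` and the five-patient dataset are assumptions for illustration, and ties in survival time are simply skipped, which is a simplification. The study itself used scikit-survival, which provides `sksurv.metrics.concordance_index_censored` for this purpose.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: among comparable pairs, the fraction in which the patient
    with the shorter observed survival has the higher predicted risk.

    A pair (i, j) is comparable when patient i had an observed event and
    times[i] < times[j]. Ties in predicted risk count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: survival time in months, event flag (1 = death observed,
# 0 = censored), and a model's predicted risk score for each patient.
times = [5, 8, 12, 20, 30]
events = [1, 1, 0, 1, 0]
risk_scores = [0.9, 0.4, 0.6, 0.3, 0.1]

c_index = concordance_index(times, events, risk_scores)
print(f"C-index: {c_index:.3f}")  # 7 of 8 comparable pairs concordant -> 0.875
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported 0.67 to 0.73 indicate moderate discrimination.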
Background and aims To date, there is still a lack of consensus on the treatment of intrahepatic cholangiocarcinoma (iCCA). This study aims to build a clinical decision support tool based on machine learning of the Surveillance, Epidemiology, and End Results (SEER) database and the Fifth Medical Center of PLA General Hospital in China. Methods A total of 4,398 eligible patients with pathology-proven iCCA from the SEER database and 504 from the hospital data were enrolled for modeling by cross-validation based on the method of machine learning. All models were trained by the open-source Python library scikit-survival version 0.16.0 and explained by SHapley Additive exPlanations. Permutation importance was calculated using the library ELI5. Results All of the involved treatment modalities can contribute to a better prognosis. Three models were derived and tested among different data sources with the concordance index in the test datasets of 0.67, 0.69, and 0.73, respectively. The prediction was also consistent with the actual situation in randomly selected real patients. Model 2 trained by the hospital data was selected to develop an online tool because of the advantage of predicting the short-term prognosis. Conclusion The prediction model and tool of this study can be applied to predicting patients’ prognosis after treatment by inputting the patient’s clinical parameters or TNM stages and the treatment option and thus contribute to the optimal clinical decision.