Objectives:
To construct and validate a prediction model based on machine learning algorithms for early recurrence and metastasis in patients with colorectal cancer after surgery.
Methods:
This study employed a prospective cohort design. A total of 498 postoperative patients with colorectal cancer, treated at an affiliated hospital of Qingdao University, were recruited by convenience sampling from June to December 2021. Data were collected during outpatient visits and hospitalizations. Risk factors for early recurrence and metastasis of colorectal cancer were identified through multivariate logistic regression analysis in SPSS 26.0. Four machine learning algorithms (logistic regression, Support Vector Machine, XGBoost, and LightGBM) were then implemented in Python 3.7.0 to develop and validate prediction models for early recurrence and metastasis of colorectal cancer after surgery.
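The model comparison described above can be illustrated with a minimal sketch. It is not the study's code: the data are synthetic (498 rows with roughly 10% positives, generated only to mimic the cohort's size and event rate), the 5-fold stratified cross-validation and AUC scoring are assumptions because the abstract does not state the validation scheme, and only the two scikit-learn models are fitted. XGBoost and LightGBM could be added to the same dictionary through their scikit-learn-compatible classes (`xgboost.XGBClassifier`, `lightgbm.LGBMClassifier`) if those packages are installed.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def build_models(seed=0):
    """Return candidate classifiers keyed by name (hypothetical settings)."""
    return {
        # Scaling is included because both models are sensitive to feature scale.
        "logistic_regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "svm": make_pipeline(StandardScaler(), SVC(random_state=seed)),
    }


def compare_models(X, y, seed=0):
    """Mean cross-validated AUC per model (assumed validation scheme)."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    scores = {}
    for name, model in build_models(seed).items():
        fold_auc = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
        scores[name] = float(fold_auc.mean())
    return scores


# Synthetic stand-in for the cohort: 498 patients, about 10% events.
X, y = make_classification(
    n_samples=498, n_features=12, n_informative=5,
    weights=[0.9], random_state=0,
)
results = compare_models(X, y)
for name, auc in results.items():
    print(f"{name}: mean AUC = {auc:.3f}")
```

Stratified folds are used here because, with an event rate near 10%, unstratified splits can leave a fold with very few positive cases.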
Results:
Of the 498 patients, 51 (10.24%) had early recurrence and metastasis. Multivariate logistic regression analysis showed that personal traits (family history of cancer, histological type, degree of tumor differentiation, number of positive lymph nodes, and T stage), behaviour and/or lifestyle (intake of refined grains, whole grains, fish, shrimp, crab, and nuts, as well as resilience), and interpersonal networks (social support) were all associated with early recurrence and metastasis of colorectal cancer (P<0.05). The logistic regression prediction model performed best among the four models, with an accuracy of 0.920, specificity of 0.982, F1 score of 0.495, AUC of 0.867, Kappa of 0.056, and Brier score of 0.067.
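For readers unfamiliar with the reported metrics, the following sketch shows how accuracy, specificity, F1, AUC, Cohen's Kappa, and the Brier score can be computed from true labels and predicted probabilities. The ten labels and probabilities are invented for illustration and are not study data; the 0.5 classification threshold and the function names are assumptions.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Count TP, FP, TN, FN after thresholding the predicted probabilities."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def auc_rank(y_true, y_prob):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true, y_prob, threshold=0.5):
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    n = tp + fp + tn + fn
    accuracy = (tp + tn) / n
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    # Cohen's Kappa: observed agreement corrected for chance agreement.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0
    # Brier score: mean squared error of the predicted probabilities.
    brier = sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / n
    return {
        "accuracy": accuracy,
        "specificity": specificity,
        "f1": f1,
        "auc": auc_rank(y_true, y_prob),
        "kappa": kappa,
        "brier": brier,
    }


# Toy example with a 20% event rate (illustrative values only).
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_prob = [0.1, 0.2, 0.1, 0.3, 0.6, 0.2, 0.1, 0.4, 0.7, 0.4]
metrics = evaluate(y_true, y_prob)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When events are rare, accuracy and specificity can be high even if few positive cases are detected, which is why F1 and Kappa are usually read alongside them.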
Conclusion:
Our findings suggest that a prediction model based on logistic regression could accurately predict which patients are likely to experience early recurrence and metastasis after colorectal cancer surgery, which may help lessen the burden on both patients and the healthcare system.