The study proposes a novel machine learning (ML) paradigm for cardiovascular disease (CVD) detection in individuals at medium to high cardiovascular risk. It uses data from a Greek cohort of 542 individuals with rheumatoid arthritis, diabetes mellitus, and/or arterial hypertension, comprising conventional (office-based) biomarkers, laboratory-based blood biomarkers, and carotid/femoral ultrasound image-based phenotypes. Two kinds of data, the CVD risk factors and the presence of CVD as ground truth (defined as stroke, myocardial infarction, coronary artery syndrome, peripheral artery disease, or coronary heart disease), were collected at two time points: (i) at visit 1 and (ii) at visit 2, after 3 years. The CVD risk factors were divided into three clusters (conventional or office-based biomarkers, laboratory-based blood biomarkers, and carotid ultrasound image-based phenotypes) to study their effect on the ML classifiers. Three kinds of ML classifiers (Random Forest, Support Vector Machine, and Linear Discriminant Analysis) were applied in a two-fold cross-validation framework to data augmented with the synthetic minority over-sampling technique (SMOTE), and their performance was recorded. In this cohort, with 46 CVD risk factors (covariates) overall, implemented in an online cardiovascular framework that requires less than 1 s of calculation time per patient, a mean accuracy and area-under-the-curve (AUC) of 98.40% and 0.98 (p < 0.0001), respectively, were obtained for CVD presence detection at visit 1, and of 98.39% and 0.98 (p < 0.0001) at visit 2. The performance of the cardiovascular framework was significantly better than that of the classical CVD risk score. The ML paradigm proved to be powerful for CVD prediction in individuals at medium to high cardiovascular risk.
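The evaluation protocol summarized above (three classifiers, two-fold cross-validation, SMOTE-style augmentation) can be illustrated with a minimal sketch. It is not the study's implementation, and everything specific in it is an assumption: the data are synthetic (542 samples with 46 covariates, generated with scikit-learn), the hyperparameters are library defaults, and `smote_like` is a hypothetical, simplified stand-in for SMOTE that interpolates between nearest minority-class neighbours. The sketch oversamples only the training fold to avoid leaking synthetic points into the test fold; the abstract does not state where in the pipeline the augmentation was applied.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC


def smote_like(X, y, k=5, seed=0):
    """Simplified SMOTE stand-in (hypothetical helper): balance two classes
    by interpolating between minority samples and their minority neighbours."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    n_new = counts.max() - counts.min()
    X_min = X[y == minority]
    if n_new == 0 or len(X_min) < 2:
        return X, y
    k = min(k, len(X_min) - 1)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    # Drop the first (nearest) column, which is normally the point itself.
    neigh = nn.kneighbors(X_min, return_distance=False)[:, 1:]
    base = rng.integers(0, len(X_min), size=n_new)
    pick = neigh[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    X_new = X_min[base] + gap * (X_min[pick] - X_min[base])
    return np.vstack([X, X_new]), np.concatenate([y, np.full(n_new, minority)])


def evaluate(X, y, seed=0):
    """Mean accuracy and AUC per classifier over two stratified folds."""
    models = {
        "RF": RandomForestClassifier(n_estimators=100, random_state=seed),
        "SVM": SVC(probability=True, random_state=seed),
        "LDA": LinearDiscriminantAnalysis(),
    }
    cv = StratifiedKFold(n_splits=2, shuffle=True, random_state=seed)
    results = {}
    for name, model in models.items():
        accs, aucs = [], []
        for train, test in cv.split(X, y):
            # Assumption: oversample the training fold only.
            X_tr, y_tr = smote_like(X[train], y[train], seed=seed)
            model.fit(X_tr, y_tr)
            scores = model.predict_proba(X[test])[:, 1]
            accs.append(accuracy_score(y[test], model.predict(X[test])))
            aucs.append(roc_auc_score(y[test], scores))
        results[name] = (float(np.mean(accs)), float(np.mean(aucs)))
    return results


# Synthetic stand-in for the cohort: 542 samples, 46 covariates, imbalanced labels.
X, y = make_classification(n_samples=542, n_features=46, n_informative=10,
                           weights=[0.85, 0.15], random_state=0)
results = evaluate(X, y)
for name, (acc, auc) in results.items():
    print(f"{name}: accuracy={acc:.3f}, AUC={auc:.3f}")
```

The numbers printed by this sketch describe the synthetic data only and are not comparable to the accuracies and AUCs reported for the cohort.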
Keywords: Cardiovascular risk estimation • Cardiovascular disease • Three-year follow-up • Conventional risk factors • Ultrasound • Machine learning

Abbreviations
ANOVA: Analysis of variance
ASCVD: Atherosclerotic cardiovascular disease
AUC: Area-under-the-curve
BMI: Body mass index
CAD: Coronary artery disease
CCVRC: Conventional cardiovascular risk calculators
Cluster 1: Conventional office-based biomarkers
Cluster 2: Fusion of office-based biomarkers and laboratory-based biomarkers
Cluster 3: Fusion of office-based biomarkers, laboratory-based biomarkers, and carotid ultrasound image phenotypes
CUSIP: Carotid ultrasound image phenotype
CV: Cross-validation
CVD: Cardiovascular disease
CVD-3YFU: Cardiovascular disease risk-three-year follow-up
CVD-CR: Cardiovascular disease-current risk
CVE: Cardiovascular events
DM: Diabetes mellitus
FH: Family history
FNR: False-negative rate
FPR: False-positive rate
FRS: Framingham risk score
HTN: Hypertension