State-of-the-art machine learning (ML) artificial intelligence methods are increasingly leveraged in clinical predictive modeling to provide clinical decision support systems to physicians. Modern ML approaches such as artificial neural networks (ANNs) and tree boosting often perform better than more traditional methods like logistic regression. On the other hand, these modern methods yield a limited understanding of the resulting predictions. However, in the medical domain, understanding of applied models is essential, in particular, when informing clinical decision support. Thus, in recent years, interpretability methods for modern ML methods have emerged to potentially allow explainable predictions paired with high performance. To our knowledge, we present in this work the first explainability comparison of two modern ML methods, tree boosting and multilayer perceptrons (MLPs), to traditional logistic regression methods using a stroke outcome prediction paradigm. Here, we used clinical features to predict a dichotomized 90 days post-stroke modified Rankin Scale (mRS) score. For interpretability, we evaluated clinical features' importance with regard to predictions using deep Taylor decomposition for MLP, Shapley values for tree boosting and model coefficients for logistic regression. With regard to performance as measured by Area under the Curve (AUC) values on the test dataset, all models performed comparably: Logistic regression AUCs were 0.83, 0.83, 0.81 for three different regularization schemes; tree boosting AUC was 0.81; MLP AUC was 0.83. Importantly, the interpretability analysis demonstrated consistent results across models by rating age and stroke severity consecutively amongst the most important predictive features. For less important features, some differences were observed between the methods. 
Our analysis suggests that modern machine learning methods can provide explainability which is compatible with domain knowledge interpretation and traditional method rankings. Future work should focus on replication of these findings in other datasets and further testing of different explainability methods.
State-of-the-art machine learning (ML) artificial intelligence methods are increasingly leveraged in clinical predictive modeling to provide clinical decision support systems to physicians. Modern ML approaches such as artificial neural networks (ANNs) and tree boosting often perform better than more traditional methods like logistic regression. On the other hand, these modern methods yield a limited understanding of the resulting predictions. However, in the medical domain, understanding of applied models is essential, in particular, when informing clinical decision support. Thus, in recent years, interpretability methods for modern ML methods have emerged to potentially allow explainable predictions paired with high performance. To our knowledge, we present in this work the first explainability comparison of two modern ML methods, tree boosting and multilayer perceptrons (MLPs), to traditional logistic regression methods using a stroke outcome prediction paradigm. Here, we used clinical features to predict a dichotomized 90 days post-stroke modified Rankin Scale (mRS) score. For interpretability, we evaluated clinical features' importance with regard to predictions using deep Taylor decomposition for MLP, Shapley values for tree boosting and model coefficients for logistic regression. With regard to performance as measured by area under the curve (AUC) values on the test dataset, all models performed comparably: Logistic regression AUCs were 0.82, 0.82, 0.79 for three different regularization schemes; tree boosting AUC was 0.81; MLP AUC was 0.81. Importantly, the interpretability analysis demonstrated consistent results across models by rating age and stroke severity consecutively amongst the most important predictive features. For less important features, some differences were observed between the methods. Our analysis suggests that modern machine learning methods can provide explainability which is compatible with domain knowledge interpretation and traditional method rankings. 
Future work should focus on replication of these findings in other datasets and further testing of different explainability methods.
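As a rough illustration of the performance measure quoted above, the following minimal sketch dichotomizes mRS scores and computes an AUC from predicted risks. The cut-off (mRS above 2 counted as poor outcome), the function names, and the toy values are assumptions made for illustration only; they are not taken from the study.

```python
def dichotomize_mrs(mrs_scores, threshold=2):
    """Map mRS scores to binary labels: 1 if above the threshold, else 0.

    The threshold of 2 is an assumed cut-off, not one stated in the abstract.
    """
    return [1 if score > threshold else 0 for score in mrs_scores]


def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count as 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: six patients with 90-day mRS and a model's predicted risk.
mrs = [0, 1, 4, 6, 2, 3]
predicted_risk = [0.10, 0.40, 0.80, 0.90, 0.35, 0.30]

labels = dichotomize_mrs(mrs)
auc = roc_auc(labels, predicted_risk)
print(f"AUC = {auc:.3f}")
```

The pairwise formulation is quadratic in the number of cases and is chosen here for readability; a rank-based computation would give the same value on larger datasets.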
Reliable prediction of outcomes of aneurysmal subarachnoid hemorrhage (aSAH) based on factors available at patient admission may support responsible allocation of resources as well as treatment decisions. Radiographic and clinical scoring systems may help clinicians estimate disease severity, but their predictive value is limited, especially in devising treatment strategies. In this study, we aimed to examine whether a machine learning (ML) approach using variables available on admission may improve outcome prediction in aSAH compared to established scoring systems. Combined clinical and radiographic features as well as standard scores (Hunt & Hess, WFNS, BNI, Fisher, and VASOGRADE) available on patient admission were analyzed using a consecutive single-center database of patients who presented with aSAH (n = 388). Different ML models (seven algorithms including three types of traditional generalized linear models, as well as a tree boosting algorithm, a support vector machine classifier (SVMC), a Naive Bayes (NB) classifier, and a multilayer perceptron (MLP) artificial neural net) were trained for single features, scores, and combined features with a random split into training and test sets (4:1 ratio), ten-fold cross-validation, and 50 shuffles. For combined features, feature importance was calculated. There was no difference in performance between traditional and other ML applications using traditional clinico-radiographic features. Also, no relevant difference was identified between a combined set of clinico-radiological features available on admission (highest AUC 0.78, tree boosting) and the best performing clinical score GCS (highest AUC 0.76, tree boosting). GCS and age were the most important variables for the feature combination. In this cohort of patients with aSAH, the performance of functional outcome prediction by machine learning techniques was comparable to traditional methods and established clinical scores. 
Future work is necessary to examine input variables other than traditional clinico-radiographic features and to evaluate whether a higher performance for outcome prediction in aSAH can be achieved.
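The resampling scheme described above (a random 4:1 split into training and test sets, ten-fold cross-validation, 50 shuffles) could be organized roughly as in the sketch below. The function name, the fixed seed, the rounding of the test-set size, and the assumption that the ten folds are formed inside each training set are all illustrative choices; the abstract does not specify these details.

```python
import random


def shuffled_splits(n_samples, n_shuffles=50, test_fraction=0.2, n_folds=10, seed=0):
    """Yield (train_idx, test_idx, folds) for repeated random 4:1 splits.

    Assumption: the cross-validation folds partition the training indices only.
    """
    rng = random.Random(seed)
    n_test = round(n_samples * test_fraction)
    for _ in range(n_shuffles):
        indices = list(range(n_samples))
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        # Deal the shuffled training indices into n_folds near-equal folds.
        folds = [train_idx[k::n_folds] for k in range(n_folds)]
        yield train_idx, test_idx, folds


# n = 388 matches the cohort size reported in the abstract.
splits = list(shuffled_splits(388))
train_idx, test_idx, folds = splits[0]
print(len(splits), len(train_idx), len(test_idx), len(folds))
```

In such a setup, each fold would serve once as the validation set for model tuning, and the held-out test set of each shuffle would be used only for the final performance estimate.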
Background and Purpose: Outcome prediction after mechanical thrombectomy (MT) in patients with acute ischemic stroke (AIS) and large vessel occlusion (LVO) is commonly performed by focusing on favorable outcome (modified Rankin Scale, mRS 0–2) after 3 months, but poor outcome representing severe disability and mortality (mRS 5 and 6) might be of equal importance for clinical decision-making.
Methods: We retrospectively analyzed patients with AIS and LVO undergoing MT from 2009 to 2018. Prognostic variables were grouped in baseline clinical (A), MRI-derived variables including mismatch [apparent diffusion coefficient (ADC) and time-to-maximum (Tmax) lesion volume] (B), and variables reflecting speed and extent of reperfusion (C) [modified treatment in cerebral ischemia (mTICI) score and time from onset to mTICI]. Three different scenarios were analyzed: (1) baseline clinical parameters only, (2) baseline clinical and MRI-derived parameters, and (3) all baseline clinical, imaging-derived, and reperfusion-associated parameters. For each scenario, we assessed prediction for favorable and poor outcome with seven different machine learning algorithms.
Results: In 210 patients, prediction of favorable outcome was improved after including speed and extent of recanalization [highest area under the curve (AUC) 0.73] compared to using baseline clinical variables only (highest AUC 0.67). Prediction of poor outcome remained stable by using baseline clinical variables only (highest AUC 0.71) and did not improve further by additional variables. Prediction of favorable and poor outcomes was not improved by adding MR-mismatch variables. Most important baseline clinical variables for both outcomes were age, National Institutes of Health Stroke Scale, and premorbid mRS.
Conclusions: Our results suggest that a prediction of poor outcome after AIS and MT could be made based on clinical baseline variables only. Speed and extent of MT did improve prediction for a favorable outcome but were not relevant for poor outcome. An MR mismatch with small ischemic core and larger penumbral tissue showed no predictive importance.