2023
DOI: 10.1371/journal.pone.0281922

Increasing transparency in machine learning through bootstrap simulation and shapely additive explanations

Abstract: Machine learning methods are widely used within the medical field. However, the reliability and efficacy of these models are difficult to assess, making it hard for researchers to identify which machine-learning model to apply to their dataset. We assessed whether variance calculations of model metrics (e.g., AUROC, sensitivity, specificity) through bootstrap simulation and SHapley Additive exPlanations (SHAP) could increase model transparency and improve model selection. Data from the England National Hea…
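The abstract's central idea, estimating the variance of a model metric such as AUROC by resampling the evaluation set with replacement, can be sketched as follows. This is a minimal illustration in plain Python, not the paper's implementation; the function names and the synthetic data are assumptions.

```python
# Minimal sketch: bootstrap variance estimation for AUROC.
# Not the paper's code; names and data here are illustrative.
import random
import statistics

def auroc(labels, scores):
    # AUROC equals the probability that a random positive example
    # outranks a random negative one (ties count half).
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auroc(labels, scores, n_boot=200, seed=0):
    # Resample (labels, scores) pairs with replacement n_boot times
    # and report the mean and standard deviation of AUROC.
    rng = random.Random(seed)
    n = len(labels)
    values = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        yb = [labels[i] for i in idx]
        sb = [scores[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # resample drew only one class; AUROC undefined
        values.append(auroc(yb, sb))
    return statistics.mean(values), statistics.stdev(values)
```

The reported standard deviation is what lets two candidate models be compared on stability, not just point accuracy, which is the model-selection use the abstract describes.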

Cited by 25 publications (27 citation statements)
References 65 publications
“…These transparent machine‐learning tools allow for increased confidence that these algorithms are picking up true signal within these covariates to predict the presence of depression rather than just replicating potential biases stemming from systemic data‐quality errors that are present within the data set. Additionally, these SHAP visualizations allow us to interpret that the increased predictive power of these machine‐learning methods is associated with the ability of these nonparametric methods to more accurately capture the nonlinear interactive relationship between the covariates, rather than just over‐fitting the model to get increased accuracy 52,53 …”
Section: Discussion
confidence: 99%
“…For continuous variables, data were expressed as mean ± standard deviation (SD); categorical variables were expressed as proportions. For the relevant covariates on demographics and exercise, chi‐squared tests were used for categorical variables; for continuous variables, the Shapiro–Wilk test for normality was performed, t‐tests were used for normally distributed variables, and non‐parametric Wilcoxon tests for non‐normally distributed variables, to compare differences between those with clinical depression and those without 22 . Univariable models assessed the effect of vigorous exercise and sedentary activity on clinical depression risk.…”
Section: Methods
confidence: 99%
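The test-selection logic quoted above (Shapiro–Wilk to check normality, then a t-test or a Wilcoxon rank-sum test) can be sketched with SciPy. This assumes SciPy is available; the function name and threshold are illustrative, not taken from the cited paper.

```python
# Hedged sketch of the normality-gated test selection described above,
# assuming scipy is installed. Names here are illustrative.
from scipy import stats

def compare_groups(a, b, alpha=0.05):
    """Compare a continuous variable between two independent groups.

    Shapiro-Wilk checks normality in each group; if both pass,
    an independent-samples t-test is used, otherwise a Wilcoxon
    rank-sum test.
    """
    normal = (stats.shapiro(a).pvalue > alpha and
              stats.shapiro(b).pvalue > alpha)
    if normal:
        result = stats.ttest_ind(a, b)
        test_name = "t-test"
    else:
        result = stats.ranksums(a, b)
        test_name = "wilcoxon rank-sum"
    return test_name, result.pvalue
```

For categorical covariates, the analogous call would be a chi-squared test on the contingency table (e.g., `scipy.stats.chi2_contingency`).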
“…While the focus of ML is on prediction, and a causal relationship cannot be assumed of the covariates found to have high predictive value, identification of novel risk factors for hypothesis generation and further research can be useful as seen in transfusion-associated lung injury (TRALI) 85 and in pediatric transfusion-associated hyperkalemia. 86 Recognizing that transparency and accountability are essential for clinicians in generating hypotheses, Zhu et al 87,88 focus on explainable AI when presenting adverse events during neonatal hyperbilirubinemia exchange transfusion, particularly through use of SHapley Additive exPlanation (SHAP).…”
Section: Hemovigilance
confidence: 99%