2020
DOI: 10.1038/s41598-020-66907-9

Predicting breast cancer risk using interacting genetic and demographic factors and machine learning

Abstract: Breast cancer (BC) is a multifactorial disease and the most common cancer in women worldwide. We describe a machine learning approach to identify a combination of interacting genetic variants (SNPs) and demographic risk factors for BC, especially factors related to both familial history (Group 1) and oestrogen metabolism (Group 2), for predicting BC risk. This approach identifies the best combinations of interacting genetic and demographic risk factors that yield the highest BC risk prediction accuracy. In tes…

Citations: cited by 50 publications (28 citation statements)
References: 63 publications
“…SHapley Additive exPlanation (SHAP) values were adopted to calculate the contribution of each given feature [26]. This approach could explain the importance of features for the study outcome, providing visual results for interpreting how the feature value would affect the outcome [27, 28]. All of the data preprocessing was performed in R software v4.0.2 (R Foundation, Vienna, Austria), and the related ML analyses were developed in Python 3.7 language.…”
Section: Methods
Citation type: mentioning
confidence: 99%
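
The statement above relies on Shapley-value attributions. As a rough illustration of the underlying idea only, the following sketch computes exact Shapley values for a single sample by enumerating feature coalitions. It does not use the `shap` library and is not the citing authors' pipeline; `shapley_values`, `toy_model`, and the all-zero baseline are hypothetical names and choices made for this example.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample x relative to a baseline.

    Features outside a coalition are set to their baseline value.
    The cost grows as 2**n, so this is only practical for a few features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_i = [x[j] if (j in subset or j == i) else baseline[j]
                          for j in range(n)]
                without_i = [x[j] if j in subset else baseline[j]
                             for j in range(n)]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


# Hypothetical toy model with an interaction between features a and c
def toy_model(features):
    a, b, c = features
    return 2.0 * a + 1.0 * b + 0.5 * a * c


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]  # assumed reference point
phi = shapley_values(toy_model, x, baseline)
```

The attributions always sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is split equally between the two features involved. Libraries such as `shap` approximate or accelerate this computation for real models.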
“…Badré and colleagues reported that, by capturing gene-gene interactions, DL-based PRSs can improve AUC from 0.64 to 0.67, in comparison with regression-based PRS (21). Behravan and colleagues also reported that, by allowing nonparametric interactions between genetic and demographic factors, the AI models improved the mean average precision in breast cancer prediction from 0.74 to 0.78, comparing with the ones using SNPs alone (22). Despite promising data regarding AI-based methods, caution is warranted as the contribution of epistatic effects over and beyond additive effects may be small for complex traits, like breast cancer (23).…”
Section: Is It Time To Move Beyond Predictive Models?
Citation type: mentioning
confidence: 99%
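
For readers unfamiliar with the metric quoted above, here is a minimal sketch of average precision for a binary classifier, using the standard ranking definition. The quoted statement does not say how the mean was taken; averaging over cross-validation folds is an assumption here, and the labels and scores are made-up illustrative values, not data from either paper. The sketch also assumes there are no tied scores.

```python
def average_precision(y_true, scores):
    """Average precision: mean of the precision values measured at each
    rank where a positive sample is retrieved (assumes no tied scores)."""
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank samples from highest to lowest predicted score
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    ap = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            ap += hits / rank  # precision at this positive's rank
    return ap / total_pos


def mean_average_precision(folds):
    """Mean of per-fold average precision; folds is a list of
    (y_true, scores) pairs. Fold-wise averaging is an assumption."""
    values = [average_precision(y, s) for y, s in folds]
    return sum(values) / len(values)


# Made-up illustrative folds
fold_a = ([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
fold_b = ([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.2])
ap_a = average_precision(*fold_a)
map_value = mean_average_precision([fold_a, fold_b])
```

A perfect ranking, as in `fold_b`, gives an average precision of 1; each negative ranked above a positive lowers the score.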
“…Then, we performed the SHAP analysis to evaluate the relative impact of each feature on the XGBoost classifier [52–55]. Each dot in Fig. 4 indicates the SHAP value of an RBC in the test dataset for the feature mentioned on the left.…”
Section: Characterization of Thalassemic RBC Quantitative Phase Images
Citation type: mentioning
confidence: 99%