2020
DOI: 10.14569/ijacsa.2020.0110808

Predicting Breast Cancer via Supervised Machine Learning Methods on Class Imbalanced Data

Abstract: Breast cancer, the second leading cause of fatality, is a widespread global health concern among women. Predicting the occurrence of breast cancer from its risk factors paves the way to earlier diagnosis and faster, more efficient treatment. Although many predictive models for breast cancer have been developed in the past, most of them are generated from highly imbalanced data. The imbalanced data is usually biased towards the majority class but in cancer d…

citations
Cited by 20 publications
(14 citation statements)
references
References 29 publications
“…For example, by Ayvaci MU et al [63], the analysis of demographic, mammography, and biopsy data using logistic regression resulted in an AUC of 0.84. Rajendran K et al [64] analyzed 2.4 million records of mammography screening and demographic risk factors associated with breast cancer to predict breast cancer using the Naïve Bayes, RF, and C4.5 techniques; the findings indicated the highest AUC (0.993) for Naïve Bayes.…”
Section: Discussion (mentioning)
confidence: 99%
“…C4.5 and EC4.5 are the two famous and most widely used DT algorithms [12]. DT is used extensively by following reference literature: [13, 14, 15, 16].…”
Section: Basics and Background (mentioning)
confidence: 99%
“…In the field of breast cancer predictions, some studies used the logistic regression approaches (Bernal et al, 2017; Oyewola et al, 2017; Westerdijk, 2018; Teja et al, 2020), while other studies used neural networks (Wang and Yoon, 2015; Kourou et al, 2015; Hou et al, 2020). Other data mining algorithms were used like decision trees (Rajendran et al, 2020), Naïve Bayes methods (Rajendran et al, 2020; Shieh et al, 2016; Williams et al, 2016), Support Vector Machines (Westerdijk, 2018; Mochen and Sundararajan, 2018; Vard et al, 2018), Random Forests (RF) (Oyewola et al, 2017; Westerdijk, 2018; Hou et al, 2020; Rajendran et al, 2020), optimization algorithms (Vard et al, 2018), etc.…”
Section: Research Article (mentioning)
confidence: 99%