Breast cancer recurrence is among the greatest fears faced by women. With modern advances in data mining technology, however, early prediction of recurrence can help relieve these fears. Although medical data are typically complex, and narrowing them down to the most relevant inputs is challenging, sophisticated data mining techniques promise accurate predictions from high-dimensional data. This study compared the performance of three established data mining algorithms, Naïve Bayes (NB), k-nearest neighbor (KNN), and fast decision tree (REPTree), in predicting breast cancer recurrence, with models built both without and with the feature extraction algorithm principal component analysis (PCA). The results showed that KNN produced better predictions without PCA (F-measure = 72.1%), whereas the other two techniques, NB and REPTree, improved when used with PCA (F-measure = 76.1% and 72.8%, respectively). This study can benefit the healthcare industry by helping physicians predict breast cancer recurrence more precisely.
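The F-measure reported above is conventionally the harmonic mean of precision and recall. The following is a minimal sketch of that standard definition for a single positive class; it is not taken from the study, which may have reported a class-weighted variant. The function name `f_measure` and the example counts are hypothetical and chosen only for illustration.

```python
def f_measure(tp: int, fp: int, fn: int) -> float:
    """Return the F-measure (F1) from confusion-matrix counts.

    tp: true positives, fp: false positives, fn: false negatives.
    Assumes the standard single-class definition, F = 2PR / (P + R).
    """
    if tp == 0:
        # No true positives: precision and recall are zero or undefined,
        # so the score is reported as 0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, not results from the study.
score = f_measure(tp=30, fp=10, fn=20)
print(f"F-measure = {score:.1%}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a model scores well on this metric only when precision and recall are both reasonably high.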