Breast cancer is the deadliest cancer in women and has the highest mortality rate among women worldwide. Early prediction of breast cancer can improve patient survival, so high prediction accuracy is important to avoid misdiagnosis. Machine learning algorithms can contribute to the early prediction and diagnosis of breast cancer. In this study, we use a rough-set-based feature selector to extract relevant features from the breast cancer feature set and classify them using machine learning algorithms such as Decision Tree, Naive Bayes, Support Vector Machine, K-Nearest Neighbor, Logistic Regression, Random Forest, and AdaBoost. The main aim is to predict cancerous breast nodules using rough-set-driven feature selection and machine learning classification algorithms. The results were evaluated in terms of accuracy, sensitivity, specificity, and positive predictive value. Random Forest outperformed all other classifiers and achieved the highest accuracy with the proposed approach (95.23%).
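The four evaluation measures named in this abstract can all be derived from a binary confusion matrix. The following is a minimal sketch, assuming labels coded as 1 (malignant) and 0 (benign); the function names and the toy labels are hypothetical and are not taken from the study.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity and positive predictive value."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    return {
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate (recall)
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "ppv": _ratio(tp, tp + fp),           # positive predictive value
    }


if __name__ == "__main__":
    # Toy labels for illustration only (assumption, not study data).
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
    for name, value in diagnostic_metrics(y_true, y_pred).items():
        print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are reported alongside accuracy because, in a diagnostic setting, a missed malignant case (false negative) and a false alarm (false positive) carry different costs that accuracy alone does not separate.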
The current study proposes an alternative strategy for managing large and complex datasets by integrating several data reduction strategies: Correlation-based Feature Selection (CFS), Best-First Search (BFS), and the Dominance-based Rough Set Approach (DRSA). The goal of the study is to improve the classifier's classification performance by removing uncorrelated or inconsistent data values. The proposed approach, called CFS-DRSA, consists of several stages carried out sequentially, with two feature extraction techniques applied in the initial phases. In the first phase, data reduction is achieved by combining the CFS approach with the BFS algorithm. In the second phase, DRSA is used with a data selection technique to obtain the most suitable dataset for the problem at hand. The primary objective is to use machine learning techniques to increase classification accuracy while minimising computation time. The proposed technique was validated experimentally on datasets with varying characteristics and sizes, and widely used evaluation measures demonstrated its reliability. The results are notable when compared with other well-known methods such as deep learning (DL). Overall, the approach is advantageous because it has been shown to help the classifier return a relevant result accurately. The CFS-DRSA model is effective when applied to continuous-valued datasets in which the data points contain no time information and are potentially inaccurate and unclear. A detailed experimental analysis was carried out to validate the CFS-DRSA technique, and the results highlight its improved performance.
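The first phase described in this abstract (CFS combined with best-first search) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: it uses the standard CFS merit, k·r_cf / sqrt(k + k(k−1)·r_ff), with absolute Pearson correlation as a stand-in correlation measure (the abstract does not say which measure was used), a simple forward best-first search with a hypothetical `max_stale` stopping limit, and toy data. The DRSA phase is not covered.

```python
import heapq
from itertools import count
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / sqrt(var_x * var_y)


def cfs_merit(subset, columns, target):
    """CFS merit: rewards feature-class correlation, penalises redundancy."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(columns[f], target)) for f in subset) / k
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = (sum(abs(pearson(columns[a], columns[b])) for a, b in pairs)
            / len(pairs)) if pairs else 0.0
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def best_first_cfs(columns, target, max_stale=5):
    """Forward best-first search over feature subsets, scored by CFS merit.

    Stops after `max_stale` consecutive expansions without improvement
    (a hypothetical default chosen for this sketch).
    """
    names = sorted(columns)
    best_subset, best_merit = (), 0.0
    tie_breaker = count()  # keeps heap ordering stable for equal merits
    open_heap = [(0.0, next(tie_breaker), ())]
    visited = {()}
    stale = 0
    while open_heap and stale < max_stale:
        neg_merit, _, subset = heapq.heappop(open_heap)
        if -neg_merit > best_merit + 1e-12:
            best_subset, best_merit = subset, -neg_merit
            stale = 0
        else:
            stale += 1
        for name in names:  # expand: add one unused feature at a time
            if name in subset:
                continue
            child = tuple(sorted(subset + (name,)))
            if child not in visited:
                visited.add(child)
                merit = cfs_merit(child, columns, target)
                heapq.heappush(open_heap, (-merit, next(tie_breaker), child))
    return list(best_subset), best_merit


if __name__ == "__main__":
    # Toy data (assumption): f_b duplicates f_a at a different scale,
    # f_c is only weakly related to the class label.
    columns = {
        "f_a": [1, 2, 3, 7, 8, 9],
        "f_b": [2, 4, 6, 14, 16, 18],
        "f_c": [5, 1, 4, 2, 5, 1],
    }
    target = [0, 0, 0, 1, 1, 1]
    print(best_first_cfs(columns, target))
```

On the toy data the search keeps a single one of the two redundant features: adding the scaled duplicate does not raise the merit, and adding the weakly related feature lowers it. That is the behaviour CFS is designed to produce.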