<p><span lang="EN-US">Breast cancer is the leading cause of cancer death among women worldwide. Early detection can reduce the death rate. Machine learning techniques are an active research area and have proved useful in cancer prediction and early diagnosis. The objective of this study is to predict and diagnose breast cancer using machine learning models and to identify the most effective model based on six criteria: specificity, sensitivity, precision, accuracy, F1-score, and the receiver operating characteristic curve. All work was done in the Anaconda environment, using Python's NumPy and SciPy numerical and scientific libraries together with pandas and matplotlib. The study used the Wisconsin diagnostic breast cancer dataset to test ten machine learning algorithms: decision tree, linear discriminant analysis, forests of randomized trees, gradient boosting, passive aggressive, logistic regression, naïve Bayes, nearest centroid, support vector machine, and perceptron. We then evaluated and compared the performance of these classification techniques. The gradient boosting model outperformed all other algorithms, with an F1-score of 96.77%.</span></p>
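The receiver operating characteristic criterion named above is commonly summarized by the area under the curve (AUC). The following is a minimal sketch of that summary, not the study's own code: the function name `roc_auc`, the labels, and the scores are all hypothetical and chosen only for illustration. It uses the pairwise interpretation of AUC, which is the probability that a randomly chosen positive sample is scored higher than a randomly chosen negative one.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all (positive, negative) pairs, how often the positive
    sample receives the higher score. Ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = malignant, 0 = benign; scores are classifier outputs.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]

auc = roc_auc(y_true, y_score)
print(f"AUC = {auc:.3f}")
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a dataset of a few hundred records; a rank-based formulation would be the usual choice for larger data.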
Breast cancer is the leading cause of cancer death for women worldwide. Detecting the cancer early lowers the death rate. Machine learning techniques are an active field of research and have been shown to be helpful in cancer prediction and early detection. The primary purpose of this research is to identify which machine learning algorithms are most successful in predicting and diagnosing breast cancer, according to five criteria: specificity, sensitivity, precision, accuracy, and F1-score. All work was carried out in the Anaconda environment, using Python's NumPy and SciPy numerical and scientific libraries as well as matplotlib and pandas. In this study, the Wisconsin diagnostic breast cancer dataset was used to evaluate eleven machine learning classifiers: decision tree, quadratic discriminant analysis, AdaBoost, bagging meta-estimator, extremely randomized trees, Gaussian process classifier, ridge, Gaussian naïve Bayes, k-nearest neighbors, multilayer perceptron, and support vector classifier. In the performance analysis, extremely randomized trees outperformed all other classifiers, with an F1-score of 96.77%.
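The five criteria listed above can all be derived from the four counts of a binary confusion matrix. The sketch below illustrates the standard definitions; it is not the study's code, and the helper names (`confusion_counts`, `evaluate`) and the example labels are hypothetical. Labels are assumed to be 1 for malignant and 0 for benign.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, with 1 as the positive class."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def _ratio(num, den):
    # Return 0.0 instead of raising when a denominator is empty.
    return num / den if den else 0.0


def evaluate(y_true, y_pred):
    """Compute the five criteria from the confusion-matrix counts."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = _ratio(tp, tp + fn)   # true positive rate (recall)
    specificity = _ratio(tn, tn + fp)   # true negative rate
    precision = _ratio(tp, tp + fp)
    accuracy = _ratio(tp + tn, tp + fp + tn + fn)
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical ground truth and predictions for ten samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

metrics = evaluate(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because F1 is the harmonic mean of precision and sensitivity, it penalizes a classifier that does well on one of the two but poorly on the other, which is one reason it is often preferred over accuracy on imbalanced medical data.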
Breast cancer is the leading cause of cancer death worldwide among women and the most frequently diagnosed type of cancer in females. Machine learning systems have been developed to assist in the accurate detection of cancer. There are numerous methods for cancer detection, but histopathological images are thought to be more precise. In this study, we used the histogram of oriented gradients (HOG) feature extractor to extract statistical features from histopathology images of invasive ductal carcinoma. We randomly selected subsets of 100, 200, 400, 1000, and 2000 images from the histopathology images. These statistical features were then used to train several algorithms, including decision tree, quadratic discriminant analysis, extremely randomized trees, gradient boosting, Gaussian process classifier, naïve Bayes, nearest centroid, multilayer perceptron, and support vector machine, to classify the images as cancerous or noncancerous. The algorithms' performance was evaluated in terms of specificity, accuracy, sensitivity, precision, F1-score, and AUC. The algorithms performed best with the 100-image subset, and their effectiveness decreased as the number of images increased.
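To make the HOG step more concrete, the sketch below computes the orientation histogram of a single cell, which is the core building block of a HOG descriptor. It is a deliberately simplified illustration and not the extractor used in the study: a full HOG implementation also tiles the image into many cells, normalizes over overlapping blocks, and usually interpolates votes between neighboring bins. The function name, the 9-bin setting, and the toy patch are assumptions.

```python
import math


def cell_orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    `cell` is a 2D list of grayscale intensities. Gradients are taken with
    central differences on interior pixels, orientations are folded into
    [0, 180) degrees, and the histogram is L2-normalized.
    """
    rows, cols = len(cell), len(cell[0])
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins

    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]
            gy = cell[r + 1][c] - cell[r - 1][c]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            # The extra modulo guards against an angle rounding up to 180.0.
            hist[int(angle // bin_width) % n_bins] += magnitude

    norm = math.sqrt(sum(v * v for v in hist))
    return [v / norm for v in hist] if norm > 0 else hist


# Toy 6x6 patch containing a vertical edge (dark left half, bright right half).
patch = [[0, 0, 0, 255, 255, 255] for _ in range(6)]

features = cell_orientation_histogram(patch)
print(features)
```

A vertical edge produces purely horizontal gradients, so all the weight falls into the first orientation bin. Feature vectors of this kind, concatenated over all cells of an image, are what a classifier such as a decision tree or support vector machine would be trained on.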
Breast cancer is the most common type of cancer in women and the leading cause of cancer death worldwide. Machine learning methods have been developed to improve the accuracy of cancer detection. Of the several methods for detecting cancer, histopathological images are considered more accurate. In this study, we employed the Gabor filter to extract statistical features from invasive ductal carcinoma histopathology images. We randomly selected subsets of 100, 200, 400, 1000, and 2000 images from the histopathological images. These statistical features were used to train several models to classify the images as malignant or benign, including decision tree, quadratic discriminant analysis, extremely randomized trees, gradient boosting, Gaussian process, naïve Bayes, nearest centroid, multilayer perceptron, and support vector machine. The models' accuracy, sensitivity, specificity, precision, and F1-score were examined. The models produced the best results with 100 images and a wavenumber of 0.2, and their effectiveness decreased as the number of images increased. Based on these findings, we suggest using deep learning rather than classical machine learning models for large datasets.
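As an illustration of the Gabor filter mentioned above, the sketch below builds the real part of a 2D Gabor kernel: a cosine carrier multiplied by an isotropic Gaussian envelope. This is a textbook formulation under stated assumptions, not the study's implementation. In particular, it treats the reported value of 0.2 as a spatial frequency in cycles per pixel, and the kernel size, `theta`, and `sigma` values are hypothetical.

```python
import math


def gabor_kernel(size, frequency, theta=0.0, sigma=2.0):
    """Real part of a 2D Gabor kernel as a size-by-size list of lists.

    size      -- odd kernel width and height in pixels
    frequency -- carrier frequency in cycles per pixel (assumed meaning of 0.2)
    theta     -- orientation of the carrier wave in radians
    sigma     -- standard deviation of the Gaussian envelope in pixels
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so the kernel has a center pixel")

    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the wave travels along direction theta.
            x_rot = x * math.cos(theta) + y * math.sin(theta)
            y_rot = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(x_rot ** 2 + y_rot ** 2) / (2.0 * sigma ** 2))
            carrier = math.cos(2.0 * math.pi * frequency * x_rot)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


kernel = gabor_kernel(size=7, frequency=0.2)
for row in kernel:
    print(" ".join(f"{v:+.3f}" for v in row))
```

Convolving an image with kernels of this kind at several orientations, and then summarizing each response with statistics such as its mean and variance, is one common way to turn Gabor responses into a feature vector for a classifier.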