Breast cancer is the second most common cancer in women. Around 1.1 million cases were recorded in 2004. Observed rates of this cancer increase with industrialization and urbanization, and also with the availability of facilities for early detection. It remains much more common in high-income countries but is now increasing rapidly in middle- and low-income countries, including in Africa, much of Asia, and Latin America. Breast cancer is fatal in under half of all cases and is the leading cause of death from cancer in women, accounting for 16% of all cancer deaths worldwide. The objective of this research paper is to report on the use of available technological advancements to develop prediction models for breast cancer survivability. We used three popular data mining algorithms (Naïve Bayes, RBF Network, J48) to develop the prediction models using a large dataset (683 breast cancer cases). We also used 10-fold cross-validation to obtain an unbiased estimate of the performance of the three prediction models for comparison purposes. The results (based on average accuracy on the breast cancer dataset) indicated that Naïve Bayes is the best predictor with 97.36% accuracy on the holdout sample (this prediction accuracy is better than any reported in the literature), RBF Network came second with 96.77% accuracy, and J48 came third with 93.41% accuracy.
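The 10-fold cross-validation procedure mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the Gaussian Naïve Bayes variant, the function names (`k_fold_indices`, `fit_gaussian_nb`, `predict_gaussian_nb`, `cross_val_accuracy`) and the synthetic two-class data are all assumptions made for the example, and the sketch assumes at least `k` samples with every class present in each training split.

```python
import math
import random
from collections import defaultdict

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def fit_gaussian_nb(X, y):
    """Estimate a prior, per-feature means and per-feature variances for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small constant keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(X), means, variances)
    return model

def predict_gaussian_nb(model, row):
    """Return the class with the highest log-posterior for one sample."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def cross_val_accuracy(X, y, k=10, seed=0):
    """Average test accuracy over k folds; each fold is held out exactly once."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit_gaussian_nb([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict_gaussian_nb(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / k

# Synthetic, well-separated two-class data (hypothetical, for illustration only).
rng = random.Random(1)
X = ([[rng.gauss(2, 1), rng.gauss(2, 1)] for _ in range(50)]
     + [[rng.gauss(7, 1), rng.gauss(7, 1)] for _ in range(50)])
y = [0] * 50 + [1] * 50
print(f"mean 10-fold accuracy: {cross_val_accuracy(X, y):.3f}")
```

Because every sample is used for testing exactly once, the averaged score depends less on one lucky or unlucky split than a single holdout estimate would.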
Purpose: Coronavirus disease is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It was first identified in Wuhan, China, in December 2019, and has since spread globally, causing an ongoing pandemic. As of June 3, 2020, 6.37 million cases had been reported in 188 countries and regions. Preventive measures during a pandemic can minimize the impact of the disease on individuals and groups. A study was carried out on coronavirus to observe the number of cases, deaths, and recoveries worldwide over a period of 5 months. Based on this data, this research paper predicts the future spread of this infectious disease in human society. Methods: In our study, the dataset was taken from WHO "Data WHO Coronavirus Covid-19 cases and deaths-WHO-COVID-19-global-data". This dataset contains information about the observation date, province/state, country/region, and latest updates. In this article, we implemented several forecasting techniques for comparison (naive method, simple average, moving average, single exponential smoothing, Holt linear trend method, Holt-Winters method and ARIMA) and examined how these methods improve the root mean square error (RMSE) score. Results: The naive method performed best among all the methods compared. In the ARIMA model, using grid search, we identified a set of parameters that produced the best-fit model for our time series data. Extending the model, future predictions of death cases indicate that the number of deaths will increase by more than 600,000 by January 2021. Conclusion: This study will support the government and experts in planning for what is about to happen. Based on the findings of the short-term model, these models can be adjusted for long-term forecasting.
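Four of the simpler forecasting baselines listed in the Methods above, together with the RMSE score used to compare them, can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the study's code: the function names, the window size, the smoothing factor `alpha` and the toy series are all hypothetical, and each method here produces a flat forecast over the horizon.

```python
import math

def naive_forecast(train, horizon):
    """Repeat the last observed value for every future step."""
    return [train[-1]] * horizon

def simple_average_forecast(train, horizon):
    """Forecast the mean of the whole training series."""
    return [sum(train) / len(train)] * horizon

def moving_average_forecast(train, horizon, window=3):
    """Forecast the mean of the last `window` observations."""
    recent = train[-window:]
    return [sum(recent) / len(recent)] * horizon

def ses_forecast(train, horizon, alpha=0.5):
    """Single exponential smoothing: level = alpha * value + (1 - alpha) * level."""
    level = train[0]
    for value in train[1:]:
        level = alpha * value + (1 - alpha) * level
    return [level] * horizon

def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Toy cumulative series (hypothetical numbers, not WHO data).
train, test = [10, 12, 14, 16], [18, 20]
for name, forecast in [
    ("naive", naive_forecast(train, len(test))),
    ("simple average", simple_average_forecast(train, len(test))),
    ("moving average", moving_average_forecast(train, len(test))),
    ("exponential smoothing", ses_forecast(train, len(test))),
]:
    print(f"{name}: RMSE = {rmse(test, forecast):.3f}")
```

On a steadily rising series such as cumulative counts, the naive forecast stays closest to the most recent level, which is one plausible reason a naive baseline can score well on RMSE.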
This article compares six machine learning (ML) algorithms: Classification and Regression Tree (CART), Support Vector Machine (SVM), Naïve Bayes (NB), K-Nearest Neighbors (KNN), Linear Regression (LR) and Multilayer Perceptron (MLP) on the Wisconsin Diagnostic Breast Cancer (WDBC) dataset by estimating their classification test accuracy, standardized data accuracy and runtime. The main objective of this study is to improve prediction accuracy using a new statistical method of feature selection. The dataset has 32 features, which are reduced using statistical techniques (mode), and the same measurements as above are applied for comparative studies. On the reduced-attribute data subset (12 features), we applied six ensemble models, AdaBoost (AB), Gradient Boosting Classifier (GBC), Random Forest (RF), Extra Trees (ET), Bagging and Extreme Gradient Boosting (XGB), to minimize the probability of misclassification by any single induced model. We also apply the stacking classifier (Voting Classifier) to the base learners Logistic Regression (LR), Decision Tree (DT), Support Vector Classifier (SVC), K-Nearest Neighbors (KNN), Random Forest (RF) and Naïve Bayes (NB) to determine the accuracy obtained by the voting classifier (meta level). To implement the ML algorithms, the dataset is divided as follows: 80% is used in the training phase and 20% is used in the test phase. To tune the classifiers, manually assigned hyperparameters are used. At the different stages of classification, all ML algorithms perform well, with test accuracy exceeding 90%, especially when applied to the data subset.
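The 80/20 split and the voting step described above can be illustrated with a small sketch. It assumes hard (majority) voting, since the abstract does not say whether hard or soft voting was used; the function names `train_test_split` and `majority_vote` and the toy predictions are hypothetical and are not taken from the article.

```python
import random
from collections import Counter

def train_test_split(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle once, then hold out `test_fraction` of the samples for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(rows) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([rows[i] for i in train_idx], [rows[i] for i in test_idx],
            [labels[i] for i in train_idx], [labels[i] for i in test_idx])

def majority_vote(predictions_per_model):
    """Hard voting: for each sample, return the label most models predicted.

    Ties go to the label that appears first among the models' votes.
    """
    return [Counter(votes).most_common(1)[0][0]
            for votes in zip(*predictions_per_model)]

# Hypothetical predictions from three base learners on three test samples
# ("M" = malignant, "B" = benign).
model_outputs = [
    ["M", "B", "B"],
    ["M", "M", "B"],
    ["B", "M", "B"],
]
print(majority_vote(model_outputs))
```

Hard voting only needs each base learner's predicted label, so it works even when some learners do not produce calibrated probabilities.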
COVID-19 has now taken a frightening form. As the days pass, it is spreading more and more widely, and it has now become an epidemic. The number of deaths, which was earlier in the hundreds, rose to thousands and then to millions. If the same situation persists, the day is not far off when people in every country on the globe will be endangered and struggling for breath. Since January 2020, many scientists, researchers and doctors have been trying to solve this complex problem so that governments can make proper arrangements in hospitals and the death rate can be reduced. This research article presents mortality rates estimated by an ARIMA model and a regression model. The dataset was collected from the DataHub Novel Coronavirus 2019 dataset and covers 22 January to 29 June 2020. To evaluate the estimated mortality rate, the error measures MAE, MSE, RMSE and MAPE were used, and the mean absolute percentage error validated the model at 99.09%. The ARIMA model is used to generate auto_arima SARIMAX results, auto_arima residual plots, ARIMA model results, and the corresponding prediction plots on the training dataset. These results indicate a continuous decline in death cases. For the regression model, the fitted coefficients are estimated, and the actual and predicted death cases are compared and analyzed. The predicted mortality rate is found to decrease after May 2, 2020. These findings will help governments and doctors prepare their forthcoming plans. Building on short-period predictions, these methods can be used to forecast the mortality rate over a longer period.
Keywords: COVID-19 • Epidemic • Humanity • Breath • ARIMA model • Regression model • RMSE
This article is part of the topical collection "Advances in Computational Approaches for Artificial Intelligence, Image Processing, IoT and Cloud Applications" guest edited by Bhanu Prakash K N and M. Shivakumar.
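The four error measures named in the preceding abstract (MAE, MSE, RMSE and MAPE) have standard definitions, sketched below. The function names and the sample numbers are assumptions for illustration only; the sketch does not reproduce the article's data or its 99.09% figure, and `mape` assumes no actual value is zero.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(mse(actual, predicted))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical actual and predicted death counts.
actual = [100, 200, 400]
predicted = [110, 190, 400]
print(f"MAE={mae(actual, predicted):.3f}  MSE={mse(actual, predicted):.3f}  "
      f"RMSE={rmse(actual, predicted):.3f}  MAPE={mape(actual, predicted):.2f}%")
```

MAE and RMSE are expressed in the units of the series, while MAPE is scale-free, which makes it easier to compare across countries with very different case counts.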