Recent trends show that heart disease has become a major cause of premature death. Large volumes of clinical data are available from biomedical devices and from the various applications used by hospitals. Artificial intelligence is widely used to predict the condition of heart patients, mainly through machine learning: a model is trained on sample cases and then used to predict the ailment from the patient's clinical test data. This paper analyzes the accuracy of various classification algorithms when they are trained on a selected set of features. Feature selection plays an important role in eliminating redundant and irrelevant features, and it reduces the training cost and time of the predictive models. The classification algorithms analyzed are Naive Bayes, Random Forest, Extra Trees and Logistic Regression, each provided with features selected using the least absolute shrinkage and selection operator (LASSO) and Ridge regression. The accuracy of the classifiers improves markedly after feature selection: on average, prediction improved by 33.3% with LASSO regression, compared with 30.73% with Ridge regression.
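The difference between the two selectors mentioned above can be illustrated with a minimal sketch. It is not the paper's pipeline: the data are synthetic, the target is a continuous score rather than a diagnosis, and the function names, the penalty strength `alpha=0.3` and the coordinate-descent solver are all assumptions made for illustration. The point it shows is that the L1 penalty of LASSO sets weak coefficients exactly to zero (so features are dropped), whereas the L2 penalty of Ridge only shrinks them.

```python
import random

def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values inside [-alpha, alpha] become exactly 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0

def coordinate_descent(X, y, alpha, penalty, n_iter=100):
    """Minimise (1/2n)*||y - Xw||^2 plus an L1 ("lasso") or L2 ("ridge") penalty.

    No intercept is fitted; features and target are assumed to be roughly centred.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # correlation of feature j with the residual that ignores feature j
            z = 0.0    # mean squared value of feature j
            for xi, yi in zip(X, y):
                pred_without_j = sum(w[k] * xi[k] for k in range(p) if k != j)
                rho += xi[j] * (yi - pred_without_j)
                z += xi[j] ** 2
            rho /= n
            z /= n
            if penalty == "lasso":
                w[j] = soft_threshold(rho, alpha) / z
            else:  # ridge, with penalty (alpha/2)*||w||^2
                w[j] = rho / (z + alpha)
    return w

# Synthetic data (an assumption, not the paper's dataset):
# features 0 and 1 drive the target, features 2 and 3 are irrelevant.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
y = [2.0 * x[0] - 1.5 * x[1] + rng.gauss(0, 0.5) for x in X]

lasso_w = coordinate_descent(X, y, alpha=0.3, penalty="lasso")
ridge_w = coordinate_descent(X, y, alpha=0.3, penalty="ridge")

# Features whose LASSO coefficient survived the soft threshold.
selected = [j for j, wj in enumerate(lasso_w) if wj != 0.0]
print("LASSO coefficients:", [round(w, 3) for w in lasso_w])
print("Ridge coefficients:", [round(w, 3) for w in ridge_w])
print("Features kept by LASSO:", selected)
```

In a workflow like the one the abstract describes, the columns listed in `selected` would be the ones passed on to the downstream classifiers. With Ridge, an explicit cut-off on coefficient magnitude would be needed to discard anything.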
Breast cancer has been identified as the second leading cause of death among women worldwide, after lung cancer, so it is crucial to identify it at an early stage, which can considerably increase the chances of survival. The most important part of cancer detection is differentiating between benign and malignant tumors, and this is where Machine Learning comes in. Taking all the dependent features into consideration, Supervised Machine Learning methods allow classification with a higher degree of accuracy and can reduce misdiagnosis by physicians, which might occur almost 20% of the time. In our paper, we focus on understanding the shortcomings of digital mammograms in the detection of breast cancer and use Machine Learning classifiers to classify benign and malignant tumors through image analysis. We also implement Supervised Machine Learning classifiers, namely Decision Tree, K Nearest Neighbour (KNN), Random Forest and Gaussian Naive Bayes, to assess the risks associated with breast cancer by analyzing the biomarkers involved. Our aim is to provide a comprehensive view of breast cancer prediction with Machine Learning through both image and data analyses, which can play a pivotal role in preventing misdiagnosis in the future. Fig. 1 gives a layout of breast cancer prediction using Supervised Machine Learning classifiers.
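As an illustration of one of the classifiers named above, the following is a minimal sketch of Gaussian Naive Bayes applied to a benign versus malignant decision. Everything in it is assumed for the example: the two features (a hypothetical tumor radius and texture score), the class means and spreads in `CLASS_PARAMS`, and the synthetic samples drawn from them. It does not reproduce the paper's data or results.

```python
import math
import random

def fit_gaussian_nb(X, y):
    """Estimate a prior plus a per-feature mean and variance for each class."""
    model = {}
    for label in sorted(set(y)):
        rows = [x for x, yi in zip(X, y) if yi == label]
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-9
            stats.append((mean, var))
        model[label] = (len(rows) / len(X), stats)
    return model

def predict_gaussian_nb(model, x):
    """Return the class with the highest log posterior, treating features as independent."""
    def log_posterior(label):
        prior, stats = model[label]
        total = math.log(prior)
        for value, (mean, var) in zip(x, stats):
            total += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)

# Hypothetical (mean, standard deviation) pairs for radius and texture per class.
CLASS_PARAMS = {
    "benign": ((12.0, 18.0), (1.5, 3.0)),
    "malignant": ((19.0, 22.0), (2.5, 3.0)),
}
rng = random.Random(1)

def make_samples(n_per_class):
    """Draw synthetic samples from the assumed class distributions."""
    X, y = [], []
    for label, (means, sds) in CLASS_PARAMS.items():
        for _ in range(n_per_class):
            X.append([rng.gauss(m, s) for m, s in zip(means, sds)])
            y.append(label)
    return X, y

train_X, train_y = make_samples(200)
test_X, test_y = make_samples(100)

model = fit_gaussian_nb(train_X, train_y)
predictions = [predict_gaussian_nb(model, x) for x in test_X]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"Held-out accuracy on synthetic data: {accuracy:.2%}")
```

The same fit-then-predict pattern applies to the other classifiers listed in the abstract; only the model inside `fit` and `predict` changes.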
Awareness of fertility is of great importance given changing lifestyle habits. Semen analysis is a reliable confirmatory test of fertility in men. The supervised base classifiers considered are Decision Tree, Logistic Regression and Naive Bayes, among which Logistic Regression shows a promising accuracy of 88%. Applying the bagging ensemble method to the weakest classifier raises its accuracy from 78.80% to 90.02%. The authors have also attempted to design a novel voting classifier, which votes over the ensemble learners and creates a more complex model, giving an accuracy of 89%. In addition, the authors have analyzed the receiver operating characteristic (ROC) curve for the Extra Tree classifier, which shows an area under the curve (AUC) of 66%. The validation procedure used is 5-fold cross-validation. The authors have further analyzed the lifestyle habits contributing to this problem using impurity-based feature selection and have found ‘Age’ to be the most crucial factor in declining seminal quality.
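The validation and voting steps described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' classifier: the data are two synthetic clusters, the voters are three k-nearest-neighbour models (stand-ins for the ensemble learners used in the paper), and the helper names `k_fold_indices`, `knn_predict` and `majority_vote` are hypothetical.

```python
import random
from collections import Counter

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices once and yield (train, test) index lists for each of k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

def knn_predict(train_X, train_y, x, n_neighbors):
    """Classify x by the majority label among its nearest training points."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return Counter(train_y[i] for i in order[:n_neighbors]).most_common(1)[0][0]

def majority_vote(labels):
    """Hard voting: return the label predicted by most voters."""
    return Counter(labels).most_common(1)[0][0]

# Synthetic two-class data (an assumption): 50 points around (0, 0), 50 around (3, 3).
rng = random.Random(42)
X = [[rng.gauss(c, 1.0), rng.gauss(c, 1.0)] for c in (0.0, 3.0) for _ in range(50)]
y = [label for label in (0, 1) for _ in range(50)]

fold_scores = []
for train, test in k_fold_indices(len(X), k=5):
    train_X = [X[i] for i in train]
    train_y = [y[i] for i in train]
    correct = 0
    for i in test:
        # Three voters with odd neighbour counts, so a binary vote cannot tie.
        votes = [knn_predict(train_X, train_y, X[i], n) for n in (1, 3, 5)]
        correct += majority_vote(votes) == y[i]
    fold_scores.append(correct / len(test))

mean_accuracy = sum(fold_scores) / len(fold_scores)
print("Per-fold accuracy:", fold_scores)
print(f"Mean 5-fold accuracy: {mean_accuracy:.2%}")
```

Each sample is used for testing exactly once, so the mean over the five folds estimates out-of-sample accuracy without setting aside a separate test set.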
Parkinson’s disease (PD) is a neurodegenerative disorder that affects movement as it progresses. Tremors are the major symptom to look out for in such cases. The disease is generally associated with the breakdown of the neurons that produce dopamine. Speech quality starts to decline as the disease progresses, and variability appears in the rate of vocal cord vibration (also known as the fundamental frequency). People with PD have been shown to produce greater variability in frequency than healthy individuals. In our paper, we compare the voice measurement features of a patient dataset to determine whether a patient is suffering from PD, using Machine Learning classifiers. We have implemented Decision Trees, Logistic Regression and K-nearest neighbors as base classifiers and have compared their performance with the ensemble learning classifiers Bagging, Random Forest and Boosting. We have compared the accuracy (%) of the classifiers and discussed which of them is most accurate at predicting the outcome of the disease. We have also identified the features most relevant to the classification and ranked them by feature importance. Our main aim is to distinguish healthy individuals from people suffering from PD by detecting dysphonia (difficulty in speaking due to declining health conditions).
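One way to rank voice features by importance is sketched below. The abstract does not say which importance measure was used, so this is an assumption: each feature is scored by the largest Gini impurity decrease that a single threshold split on it can achieve, which is a one-level simplification of the impurity-based importance reported by tree ensembles. The feature names (`jitter`, `shimmer`, `unrelated`) and the synthetic values are hypothetical.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_gini_decrease(values, labels):
    """Largest impurity decrease achievable by one threshold split on this feature."""
    parent = gini(labels)
    n = len(labels)
    best = 0.0
    for threshold in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        # Weighted average impurity of the two child nodes.
        child = (len(left) * gini(left) + len(right) * gini(right)) / n
        best = max(best, parent - child)
    return best

# Synthetic data (an assumption): label 0 = healthy, label 1 = PD.
rng = random.Random(7)
labels = [0] * 100 + [1] * 100
features = {
    # Strongly shifted between the two groups.
    "jitter": [rng.gauss(0.3 + 0.4 * l, 0.1) for l in labels],
    # Moderately shifted between the two groups.
    "shimmer": [rng.gauss(1.0 + 0.75 * l, 0.5) for l in labels],
    # Carries no information about the label.
    "unrelated": [rng.gauss(0.0, 1.0) for _ in labels],
}

scores = {name: best_gini_decrease(vals, labels) for name, vals in features.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
for name in ranking:
    print(f"{name}: {scores[name]:.3f}")
```

A feature that separates the two groups cleanly yields a decrease close to the parent impurity (0.5 for balanced classes), while an uninformative feature scores near zero.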