Hepatitis B is a potentially deadly liver infection caused by the hepatitis B virus, and it is a serious global public health problem. Substantial efforts have been made to apply machine learning to detecting the virus. However, model interpretability has seen limited use in the existing literature, even though it makes it easier for humans to understand and trust a machine-learning model. Therefore, in this study, we used SHapley Additive exPlanations (SHAP), a game-theoretic approach, to explain and visualize the predictions of machine learning models applied to hepatitis B diagnosis. The algorithms used to build the models were decision tree, logistic regression, support vector machines, random forest, adaptive boosting (AdaBoost), and extreme gradient boosting (XGBoost), which achieved balanced accuracies of 75%, 82%, 75%, 86%, 92%, and 90%, respectively. The SHAP values showed that bilirubin is the most significant feature contributing to a higher mortality rate. In particular, older patients with elevated bilirubin levels are more likely to die. The outcome of this study can aid health practitioners and health policymakers in explaining the results of machine learning models for health-related problems.
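To make the game-theoretic idea behind SHAP concrete, the following is a minimal sketch that computes exact Shapley values by enumerating every coalition of features. It is not the study's code and does not use the SHAP library; the function `toy_risk`, its feature names, and its numbers are hypothetical and exist only to illustrate the attribution rule.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values, computed by enumerating all coalitions.

    value_fn maps a frozenset of "present" features to a model output.
    The cost grows exponentially with the number of features, so this is
    only practical for a handful of them.
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Shapley weight for a coalition of size k that excludes f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of f when it joins coalition s.
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def toy_risk(present):
    """Hypothetical risk score with made-up numbers (not study results)."""
    score = 0.0
    if "bilirubin" in present:
        score += 0.3
    if "age" in present:
        score += 0.1
    if "bilirubin" in present and "age" in present:
        score += 0.2  # interaction term, split evenly between the two
    return score


phi = shapley_values(toy_risk, ["bilirubin", "age", "sex"])
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

In this toy setup the attributions are approximately 0.4 for bilirubin, 0.2 for age, and 0 for sex, and they sum to the score of the full feature set (0.6). That additivity is the property SHAP relies on when it ranks features by their contribution to a prediction.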
This study introduces a novel predictive methodology for diagnosing and predicting gear problems in DC motors. Leveraging AdaBoost with weak classifiers and regressors, the diagnostic aspect categorizes the machine’s current operational state by analyzing time–frequency features extracted from motor current signals. AdaBoost classifiers are employed as weak learners to effectively identify fault severity conditions. Meanwhile, the prognostic aspect utilizes AdaBoost regressors, also acting as weak learners trained on the same features, to predict the machine’s future state and estimate its remaining useful life. A key contribution of this approach is its ability to address the challenge of limited historical data for electrical equipment by optimizing AdaBoost parameters with minimal data. Experimental validation is conducted using a dedicated setup to collect comprehensive data. Illustrative examples based on the experimental data demonstrate the method's efficacy in identifying malfunctions and forecasting the remaining useful life of DC motors.
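As a rough illustration of how AdaBoost combines weak learners, the sketch below implements the classic discrete AdaBoost update with one-feature threshold rules (decision stumps) for a two-class problem. It covers only the classification side, not the regressors used for remaining-useful-life estimation, and it is not the study's pipeline: the one-dimensional data set `X`, `y` is a made-up toy example, not motor current features.

```python
import math


def fit_stump(X, y, w):
    """Find the threshold rule with the lowest weighted error.

    Returns (error, feature index, threshold, sign); the rule predicts
    `sign` when row[feature] <= threshold and `-sign` otherwise.
    """
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] <= thr else -sign for row in X]
                err = sum(wi for wi, p, t in zip(w, pred, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] <= thr else -sign


def adaboost_fit(X, y, rounds=3):
    """Discrete AdaBoost for labels in {-1, +1}."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        err = max(stump[0], 1e-10)  # guard against log of zero
        if err >= 0.5:
            break  # the weak learner is no better than chance
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * t * stump_predict(stump, row))
             for wi, row, t in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def adaboost_predict(ensemble, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical toy data: no single threshold separates the two classes.
X = [[v] for v in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble = adaboost_fit(X, y, rounds=3)
predictions = [adaboost_predict(ensemble, row) for row in X]
print(predictions == y)
```

On this toy set the best single stump misclassifies three of the ten points, while the weighted vote of three stumps classifies all ten correctly, which is the sense in which boosting turns weak learners into a stronger one.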
Feature selection has become essential in classification problems with numerous features. This process involves removing redundant, noisy, and performance-degrading features from the dataset to enhance the classifier’s performance. Some features are less useful than others or do not correlate with the system’s evaluation, and their removal does not affect the system’s performance. In most cases, removing features with a monotonically decreasing impact on the system’s performance increases accuracy. Therefore, this paper proposes a novel dimensionality reduction method based on feature selection. The approach combines filter and wrapper techniques to select optimal features using Mutual Information with the Sequential Forward Method and 10-fold cross-validation. Results show that the proposed algorithm can reduce the number of features by more than 75% in datasets with many features and achieve a maximum accuracy of 97%. The algorithm outperforms or performs similarly to existing ones. The proposed algorithm could be a better option for classification problems that call for a minimal feature set.
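The combination of a filter stage and a wrapper stage can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: features are discrete, the filter keeps features whose mutual information with the label exceeds a hypothetical threshold `min_mi`, and the wrapper runs a greedy forward search scored by 10-fold cross-validated accuracy of a stand-in lookup-table classifier. The data set is a made-up toy example.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    pxy, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    return sum((c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())


def cv_accuracy(X, y, feats, k=10):
    """k-fold cross-validated accuracy of a majority-vote lookup classifier."""
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        default = Counter(y[i] for i in train).most_common(1)[0][0]
        votes = {}
        for i in train:
            key = tuple(X[i][f] for f in feats)
            votes.setdefault(key, Counter())[y[i]] += 1
        for i in test:
            key = tuple(X[i][f] for f in feats)
            # Fall back to the training majority for unseen combinations.
            pred = votes[key].most_common(1)[0][0] if key in votes else default
            correct += pred == y[i]
    return correct / len(X)


def select_features(X, y, min_mi=0.01, k=10):
    n_feats = len(X[0])
    mi = [mutual_information([row[f] for row in X], y) for f in range(n_feats)]
    # Filter stage: drop features that carry almost no information.
    candidates = sorted((f for f in range(n_feats) if mi[f] > min_mi),
                        key=lambda f: -mi[f])
    # Wrapper stage: add one feature at a time while accuracy improves.
    selected, best = [], 0.0
    while candidates:
        score, feat = max(((cv_accuracy(X, y, selected + [f], k), f)
                           for f in candidates), key=lambda t: t[0])
        if score <= best:
            break
        selected.append(feat)
        candidates.remove(feat)
        best = score
    return selected


# Hypothetical toy data: feature 0 equals the label, feature 1 is
# independent of the label, and feature 2 duplicates feature 0.
y = [i % 2 for i in range(20)]
X = [[y[i], (i // 2) % 2, y[i]] for i in range(20)]

selected = select_features(X, y)
print(selected)
```

Here the filter discards the uninformative feature 1, and the forward search stops after feature 0 because adding the redundant feature 2 does not raise the cross-validated accuracy, so three features are reduced to one.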