A Comparison of Machine Learning Techniques for the Detection of Type-2 Diabetes Mellitus: Experiences from Bangladesh

Uddin, Md. Jamal; Ahamad, Md. Martuza; Hoque, Md. Nesarul; Walid, Md. Abul Ala; Aktar, Sakifa; Alotaibi, Naif; Alyami, Salem A.; Kabir, Muhammad Ashad; Moni, Mohammad Ali

doi:10.3390/info14070376

Cited by 14 publications

(9 citation statements)

References 71 publications


“…Conversely, when the p-value falls below 0.05, it indicates a probable correlation between the category attributes and the dependent variable. The equation for χ² is given below:

$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i}$$

where the observed frequencies are denoted as $O_i$, the predicted frequencies are denoted as $E_i$, and the sample size is denoted as n [39].…”

Section: Methods (mentioning)

confidence: 99%

Interpretable Machine Learning Framework to Predict the Glass Transition Temperature of Polymers

Uddin, Fan. 2024. Polymers. Self-citation.

The glass transition temperature of polymers is a key parameter in meeting the application requirements for energy absorption. Previous studies have provided some data from slow, expensive trial-and-error procedures. By recognizing these data, machine learning algorithms are able to extract valuable knowledge and disclose essential insights. In this study, a dataset of 7174 samples was utilized. The polymers were numerically represented using two methods: Morgan fingerprint and molecular descriptor. During preprocessing, the dataset was scaled using a standard scaler technique. We removed the features with small variance from the dataset and used the Pearson correlation technique to exclude the features that were highly correlated. Then, the most significant features were selected using the recursive feature elimination method. Nine machine learning techniques were employed to predict the glass transition temperature, and their hyperparameters were tuned. The models were compared using the performance metrics of mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²). We observed that the extra tree regressor provided the best results. Significant features were also identified using statistical machine learning methods. The SHAP method was also employed to demonstrate the influence of each feature on the model’s output. This framework can be adapted to other properties at a low computational expense.
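The chi-squared test mentioned in the citation statement for this paper can be illustrated with a short sketch. It computes the Pearson statistic for a contingency table by summing (observed - expected)² / expected over all cells. The function name, the example table, and the restriction of the p-value shortcut to one degree of freedom are assumptions made for illustration; none of them come from the cited paper.

```python
import math

def chi_squared_statistic(observed):
    """Pearson chi-squared statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count under independence of the row and column variables.
            e = row_totals[i] * col_totals[j] / total
            statistic += (o - e) ** 2 / e
    return statistic

# Hypothetical 2x2 table: rows are feature categories, columns are outcome classes.
table = [[30, 10], [20, 40]]
stat = chi_squared_statistic(table)  # about 16.67

# For a 2x2 table the statistic has 1 degree of freedom, and only in that case
# the upper-tail p-value equals erfc(sqrt(stat / 2)).
p_value = math.erfc(math.sqrt(stat / 2))
print(f"chi2 = {stat:.2f}, significant at 0.05: {p_value < 0.05}")
```

For tables with more than one degree of freedom, a statistics library would normally be used to obtain the p-value.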



“…The evaluation techniques used in this study are based on measures obtained from [62], namely Accuracy (AY), Precision (PN), Recall (RL), F-Measure (FE), Kappa (KA), Log-Loss (LS) and class-specific AUC ROC curves, and Confusion Matrix. These metrics serve as significant benchmarks for assessing the results of the experiment.…”

Section: Performance Measure (mentioning)

confidence: 99%

“…The most significant probability-based order unit of measurement is log-loss. The log-loss metric quantifies the uncertainty of a probabilistic approach by evaluating its accuracy in predicting true labels [62]. A low log-loss value suggests an accurate prediction.…”

Section: Performance Measure (mentioning)

confidence: 99%

Adapted Deep Ensemble Learning-Based Voting Classifier for Osteosarcoma Cancer Classification

Walid, Mollick, Shill et al. 2023. Diagnostics. Self-citation.

The study utilizes an osteosarcoma hematoxylin and eosin-stained image dataset, which is unevenly distributed, and this raises concerns about the potential impact on the overall performance and reliability of any analyses or models derived from the dataset. In this study, a deep-learning-based convolution neural network (CNN) and adapted heterogeneous ensemble-learning-based voting classifier have been proposed to classify osteosarcoma. The proposed methods can also resolve the issue and develop unbiased learning models by introducing an evenly distributed training dataset. Data augmentation is employed to boost the generalization abilities. Six different pre-trained CNN models, namely MobileNetV1, MobileNetV2, ResNetV250, InceptionV2, EfficientNetV2B0, and NasNetMobile, are applied and evaluated in frozen and fine-tuned-based phases. In addition, a novel CNN model and adapted heterogeneous ensemble-learning-based voting classifier developed from the proposed CNN model, fine-tuned NasNetMobile model, and fine-tuned EfficientNetV2B0 model are also introduced to classify osteosarcoma. The proposed CNN model outperforms other pre-trained models. The Kappa score obtained from the proposed CNN model is 93.09%. Notably, the proposed voting classifier attains the highest Kappa score of 96.50% and outperforms all other models. The findings of this study have practical implications in telemedicine, mobile healthcare systems, and as a supportive tool for medical professionals.
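Two of the measures named in the citation statements for this paper, log-loss and Kappa, can be sketched in a few lines. The example below is a minimal binary-classification illustration with made-up labels and probabilities; the function names and data are hypothetical and are not taken from the cited study, which may compute these measures differently (for example, for more than two classes).

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Binary log-loss: mean negative log-likelihood of the true labels."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between labels and predictions, corrected for chance."""
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement from the marginal label frequencies.
    # Undefined (division by zero) if chance agreement is exactly 1.
    expected = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels)
    return (observed - expected) / (1 - expected)

# Hypothetical labels and predicted probabilities of class 1.
y_true = [1, 0, 1, 1, 0, 0]
confident = [0.9, 0.1, 0.8, 0.7, 0.2, 0.3]
hesitant = [0.6, 0.4, 0.55, 0.5, 0.45, 0.5]

loss_confident = log_loss(y_true, confident)
loss_hesitant = log_loss(y_true, hesitant)
kappa = cohen_kappa(y_true, [1, 0, 1, 0, 0, 0])
print(loss_confident < loss_hesitant, round(kappa, 3))
```

The confident, correct probabilities give the lower log-loss, which matches the quoted remark that a low log-loss value suggests an accurate prediction.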


“…In order to improve the survival rate of heart failure patients, the extra tree classifier (ETC) was proposed; it uses SMOTE to balance the data [25]. Also, the authors used SMOTE to classify diabetes and reliable stress levels [26,27]. Fitriyani et al proposed using extreme gradient boosting with SMOTE-ENN to solve the cardiovascular prediction problem [28].…”

Section: Introduction (mentioning)

confidence: 99%

“…Classification on imbalanced datasets can result in biased outcomes, as most standard classification algorithms favor the majority class, leading to poor prediction accuracy for the minority class. To balance the data distribution, most prior studies employed the SMOTE method [24][25][26][27][28], which has some disadvantages. The quality of the samples generated by SMOTE depends on the parameter k, which is difficult to determine due to the variety of datasets.…”

Section: Introduction (mentioning)

confidence: 99%

Highly Imbalanced Classification of Gout Using Data Resampling and Ensemble Method

Si, Wang, et al. 2024. Algorithms.

Gout is one of the most painful diseases in the world. Accurate classification of gout is crucial for diagnosis and treatment which can potentially save lives. However, the current methods for classifying gout periods have demonstrated poor performance and have received little attention. This is due to a significant data imbalance problem that affects the learning attention for the majority and minority classes. To overcome this problem, a resampling method called ENaNSMOTE-Tomek link is proposed. It uses extended natural neighbors to generate samples that fall within the minority class and then applies the Tomek link technique to eliminate instances that contribute to noise. The model combines the ensemble ’bagging’ technique with the proposed resampling technique to improve the quality of generated samples. The performance of individual classifiers and hybrid models on an imbalanced gout dataset taken from the electronic medical records of a hospital is evaluated. The results of the classification demonstrate that the proposed strategy is more accurate than some imbalanced gout diagnosis techniques, with an accuracy of 80.87% and an AUC of 87.10%. This indicates that the proposed algorithm can alleviate the problems caused by imbalanced gout data and help experts better diagnose their patients.
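The role of the parameter k in SMOTE, raised in the citation statements for this paper, is easier to see in code. The sketch below implements the classic SMOTE interpolation step only: pick a minority sample, pick one of its k nearest minority neighbours, and create a point on the segment between them. It is not the ENaNSMOTE-Tomek link method proposed in the cited paper, and the function name, data, and seed are hypothetical.

```python
import random

def smote(minority, n_synthetic, k=5, seed=0):
    """Generate synthetic minority samples by interpolating towards nearest neighbours.

    `minority` is a list of feature tuples and must contain at least two samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` (squared Euclidean distance),
        # excluding `base` itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic

# Hypothetical minority class: the four corners of the unit square.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
samples = smote(minority, n_synthetic=10, k=2)
print(len(samples))
```

A small k keeps synthetic points close to existing samples, while a large k can interpolate across regions occupied by the majority class, which is one reason the choice of k is dataset dependent.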
