2019
DOI: 10.1109/access.2019.2929866

DMP_MI: An Effective Diabetes Mellitus Classification Algorithm on Imbalanced Data With Missing Values

Abstract: As a widely known chronic disease, diabetes mellitus is called a silent killer. It makes the body produce less insulin and causes increased blood sugar, which leads to many complications and affects the normal functioning of various organs, such as eyes, kidneys, and nerves. Although diabetes has attracted high attention in research, due to the existence of missing values and class imbalance in the data, the overall performance of diabetes classification using machine learning is relatively low. In this paper,…

Citations: cited by 84 publications (41 citation statements)
References: 21 publications
“…In such a technique, the new imputed value could be far from the central tendency of the population distribution. The performance in the pipeline (see Table 9) employed in [18], [20], [41], [42], [46] is less as comparing the proposed framework and others in [7], [44], [45]. Those fewer performances clearly indicate the role of outlier rejection and filling missing values in the PID dataset.…”
Section: E Results Comparison
Citation type: mentioning
confidence: 99%
“…In addition, a study used the SMOTE with several ML algorithms to predict acute myocardial infarction < 1 month and all-cause mortality < 1 month for MACE in emergency department patients with chest pain [23]. Furthermore, in [24] the authors proposed a prediction algorithm for diabetes mellitus classification on imbalanced data using the adaptive synthetic (ADASYN) sampling technique to reduce the influence of class imbalance, then using a RF classifier to generate predictions models.…”
Section: B Imbalanced Data Solution In Medical Domain
Citation type: mentioning
confidence: 99%
“…However, only a few studies discussed about preprocessing on Pima Indian dataset. The problem of missing value is discussed in a limited number of papers [8,13,14,15,17]. The problem of imbalanced data [10,11,17] and of feature selection [5,9,10,14] have been discussed too.…”
Section: Literature Review
Citation type: mentioning
confidence: 99%
“…There are several studies that discussed diabetes diagnosis prediction based on data. Besides Pima Indian dataset [3][4][5][6][7][8][9][10][11][12][13][14][15][16][17], there is also data from Luzhou [4], Irvine [18], Kashmir [19,20], online questionnaire [21], and dr. Schorling [9,21]. There are various classification methods on diabetes diagnosis prediction like random forest, J48, naïve bayes (NB), support vector machine (SVM), logistic regression, neural network (NN), and K-Nearest Neighbors.…”
Section: Introduction
Citation type: mentioning
confidence: 99%