2012
DOI: 10.1515/jisys-2012-0010
Cascading k-means with Ensemble Learning: Enhanced Categorization of Diabetic Data

Abstract: This paper illustrates the application of various ensemble methods for enhanced classification accuracy. The case in point is the Pima Indian Diabetic Dataset (PIDD). The computational model comprises two stages. In the first stage, k-means clustering is employed to identify and eliminate wrongly classified instances. In the second stage, the classification is fine-tuned using ensemble methods such as AdaBoost, bagging, dagging, stacking, decorate, rotation forest, random subspace, M…
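The two-stage pipeline the abstract describes can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the synthetic data stands in for PIDD, the cluster-majority rule for flagging "wrongly classified" instances is an assumption, and bagging is just one of the ensemble methods the paper lists.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import BaggingClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the Pima Indian Diabetic Dataset (8 features, 2 classes).
X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           n_classes=2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Stage 1: k-means clustering; drop instances whose class label disagrees
# with the majority label of their cluster (treated as "wrongly classified").
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_tr)
keep = np.zeros(len(y_tr), dtype=bool)
for c in range(km.n_clusters):
    members = km.labels_ == c
    majority = np.bincount(y_tr[members]).argmax()
    keep[members] = y_tr[members] == majority

# Stage 2: fine-tune the classification with an ensemble method
# (bagging over decision trees, sklearn's default base estimator).
clf = BaggingClassifier(n_estimators=25, random_state=0).fit(X_tr[keep], y_tr[keep])
print(round(clf.score(X_te, y_te), 3))
```

The cleaning step shrinks the training set, so in practice one would compare accuracy with and without stage 1 to confirm the filtering helps.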

Cited by 7 publications (8 citation statements)
References 10 publications (17 reference statements)
“…This technique, which takes into account both rules and physicians' knowledge, has better accuracy as compared to other prediction approaches [4]. Choubey [10].…”
Section: Related Work
confidence: 99%
“…They used two feature selection approaches: (i) randomly selected features, and (ii) a fixed feature set shared by each base classifier of the ensemble; the fixed feature set yielded higher accuracy than the randomly selected features. Among the ensemble methods, random forest performed best with dissimilarity measure-based features [29][30][31][32][33][34].…”
Section: Related Work
confidence: 99%
“…Rule generation algorithms may result in many spurious, ambiguous, uninteresting and irrelevant rules. Statistical independence and correlation analysis are two approaches applied by Han and Kamber [32] in clearing out uninteresting and misleading data mining patterns. Statistical techniques including chi-squared test [23], log-linear analysis [24] and Regression Analysis [25] help in capturing statistical dependence among data items.…”
Section: Related Work
confidence: 99%
“…We compare OIILS with some classic ensemble methods [62][63][64][65]. First, we use SMOTE to balance the generated training datasets and utilize KNN as the basic classification algorithm, because the combination of SMOTE and KNN achieves the best performance among all the combinations.…”
Section: RQ4 Experiment Setting
confidence: 99%
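The SMOTE-plus-KNN combination described above can be sketched as follows. The `smote` helper is a deliberately simplified reimplementation for illustration, not the imbalanced-learn library, and the toy data and parameters are assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier, NearestNeighbors
from sklearn.datasets import make_classification

def smote(X_min, n_new, k=5, rng=np.random.default_rng(0)):
    """Minimal SMOTE: synthesize points by interpolating each minority
    sample toward one of its k nearest minority neighbors."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)          # idx[:, 0] is the point itself
    base = rng.integers(0, len(X_min), n_new)
    neigh = idx[base, rng.integers(1, k + 1, n_new)]
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[neigh] - X_min[base])

# Imbalanced toy data (roughly 90% / 10%).
X, y = make_classification(n_samples=400, weights=[0.9, 0.1], random_state=0)
X_min = X[y == 1]
X_new = smote(X_min, n_new=(y == 0).sum() - (y == 1).sum())
X_bal = np.vstack([X, X_new])
y_bal = np.concatenate([y, np.ones(len(X_new), dtype=int)])

# KNN as the basic classification algorithm on the balanced training set.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_bal, y_bal)
print(np.bincount(y_bal))   # classes are now balanced
```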