2012
DOI: 10.1515/jisys-2012-0010
Cascading k-means with Ensemble Learning: Enhanced Categorization of Diabetic Data

Abstract: This paper illustrates the application of various ensemble methods for enhanced classification accuracy. The case in point is the Pima Indian Diabetic Dataset (PIDD). The computational model comprises two stages. In the first stage, k-means clustering is employed to identify and eliminate wrongly classified instances. In the second stage, the classification is fine-tuned using ensemble methods such as AdaBoost, bagging, dagging, stacking, decorate, rotation forest, random subspace, M…
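The two-stage pipeline the abstract describes can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the synthetic data stands in for PIDD, the cluster-majority rule for flagging "wrongly classified" instances is an assumption, and bagging is just one of the ensemble methods the paper lists.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import BaggingClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the Pima Indian Diabetic Dataset (8 features, 2 classes).
X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           n_classes=2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Stage 1: k-means clustering; drop instances whose class label disagrees
# with the majority label of their cluster (treated as "wrongly classified").
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_tr)
keep = np.zeros(len(y_tr), dtype=bool)
for c in range(km.n_clusters):
    members = km.labels_ == c
    majority = np.bincount(y_tr[members]).argmax()
    keep[members] = y_tr[members] == majority

# Stage 2: fine-tune the classification with an ensemble method
# (bagging over decision trees, sklearn's default base estimator).
clf = BaggingClassifier(n_estimators=25, random_state=0).fit(X_tr[keep], y_tr[keep])
print(round(clf.score(X_te, y_te), 3))
```

The cleaning step shrinks the training set, so in practice one would compare accuracy with and without stage 1 to confirm the filtering helps.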

Cited by 7 publications (8 citation statements)
References 10 publications (17 reference statements)
“…This technique, which takes into account both rules and physicians' knowledge, has better accuracy as compared to other prediction approaches [4]. Choubey [10].…”
Section: Related Work
confidence: 99%
“…They used two feature selection approaches: (i) randomly selected features, and (ii) a fixed feature set shared by each base classifier of the ensemble; the fixed feature set yielded higher accuracy than the randomly selected features. Among the ensemble methods, random forest performed best with dissimilarity measure-based features [29][30][31][32][33][34].…”
Section: Related Work
confidence: 99%
“…Rule generation algorithms may result in many spurious, ambiguous, uninteresting and irrelevant rules. Statistical independence and correlation analysis are two approaches applied by Han and Kamber [32] in clearing out uninteresting and misleading data mining patterns. Statistical techniques including chi-squared test [23], log-linear analysis [24] and Regression Analysis [25] help in capturing statistical dependence among data items.…”
Section: Related Work
confidence: 99%
“…We compare OIILS with some classic ensemble methods [62][63][64][65]. First, we use SMOTE to balance the generated training datasets and utilize KNN as the basic classification algorithm, because the combination of SMOTE and KNN achieves the best performance among all the combinations.…”
Section: RQ4 Experiment Setting
confidence: 99%
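The SMOTE-plus-KNN combination described above can be sketched as follows. The `smote` helper is a deliberately simplified reimplementation for illustration, not the imbalanced-learn library, and the toy data and parameters are assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier, NearestNeighbors
from sklearn.datasets import make_classification

def smote(X_min, n_new, k=5, rng=np.random.default_rng(0)):
    """Minimal SMOTE: synthesize points by interpolating each minority
    sample toward one of its k nearest minority neighbors."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)          # idx[:, 0] is the point itself
    base = rng.integers(0, len(X_min), n_new)
    neigh = idx[base, rng.integers(1, k + 1, n_new)]
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[neigh] - X_min[base])

# Imbalanced toy data (roughly 90% / 10%).
X, y = make_classification(n_samples=400, weights=[0.9, 0.1], random_state=0)
X_min = X[y == 1]
X_new = smote(X_min, n_new=(y == 0).sum() - (y == 1).sum())
X_bal = np.vstack([X, X_new])
y_bal = np.concatenate([y, np.ones(len(X_new), dtype=int)])

# KNN as the basic classification algorithm on the balanced training set.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_bal, y_bal)
print(np.bincount(y_bal))   # classes are now balanced
```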