Síndrome de Alicia en el país de las maravillas

Clustering analysis plays an important role in scientific research and commercial application. K-means algorithm is a widely used partition method in clustering. However, it is known that the K-means algorithm may get stuck at suboptimal solutions, depending on the choice of the initial cluster centers. In this article, we propose a technique to handle large scale data, which can select initial clustering center purposefully using Genetic algorithms (GAs), reduce the sensitivity to isolated point, avoid dissevering big cluster, and overcome deflexion of data in some degree that caused by the disproportion in data partitioning owing to adoption of multi-sampling.We applied our method to some public datasets these show the advantages of the proposed approach for example Hepatitis C dataset that has been taken from the machine learning warehouse of University of California. Our aim is to evaluate hepatitis dataset. In order to evaluate this dataset we did some preprocessing operation, the reason to preprocessing is to summarize the data in the best and suitable way for our algorithm. Missing values of the instances are adjusted using local mean method.

show abstract

Genetic K-Means Adaption Algorithm for Clustering Stakeholders in System Requirements

Reyad

Dukhan

Marghny

et al. 2021

View full text Add to dashboard Cite

Outlier Detection Using Improved Genetic K-Means

Marghny

Taloba

2011

SSRN Journal

View full text Add to dashboard Cite

Fast Efficient Clustering Algorithm for Balanced Data

et al. 2014

View full text Add to dashboard Cite

Abstract-The Cluster analysis is a major technique for statistical analysis, machine learning, pattern recognition, data mining, image analysis and bioinformatics. K-means algorithm is one of the most important clustering algorithms. However, the kmeans algorithm needs a large amount of computational time for handling large data sets. In this paper, we developed more efficient clustering algorithm to overcome this deficiency named Fast Balanced k-means (FBK-means). This algorithm is not only yields the best clustering results as in the k-means algorithm but also requires less computational time. The algorithm is working well in the case of balanced data.

show abstract

Outlier Detection using Improved Genetic K-means

Marghny¹,

Taloba²

2011

IJCA

View full text Add to dashboard Cite

The outlier detection problem in some cases is similar to the classification problem. For example, the main concern of clustering-based outlier detection algorithms is to find clusters and outliers, which are often regarded as noise that should be removed in order to make more reliable clustering.In this article, we present an algorithm that provides outlier detection and data clustering simultaneously. The algorithmimprovesthe estimation of centroids of the generative distribution during the process of clustering and outlier discovery. The proposed algorithm consists of two stages. The first stage consists of improved genetic k-means algorithm (IGK) process, while the second stage iteratively removes the vectors which are far from their cluster centroids.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

M. H. Marghny

An Effective Evolutionary Clustering Algorithm: Hepatitis C Case Study

Genetic K-Means Adaption Algorithm for Clustering Stakeholders in System Requirements

Outlier Detection Using Improved Genetic K-Means

Fast Efficient Clustering Algorithm for Balanced Data

Outlier Detection using Improved Genetic K-means

Contact Info

Product

Resources

About