2016
DOI: 10.5120/ijca2016910868
Comparison between Standard K-Mean Clustering and Improved K-Mean Clustering

Abstract: Clustering in data mining is important for discovering distribution patterns, and its importance grows as the amount of data increases. It is one of the main analytical methods in data mining, and the choice of clustering method directly influences the results. K-means is a typical clustering algorithm [3]. It consists mainly of two phases: initializing random cluster centers and assigning each point to its nearest center. Both phases have shortcomings, which are discussed in the paper, and two methods are proposed to address them. First…
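The two phases named in the abstract can be sketched as follows. This is a minimal illustration of standard K-means, not the paper's improved algorithm; the function name, seeding, and fixed iteration count are placeholders:

```python
import random
import math

def kmeans(points, k, iters=20, seed=0):
    """Minimal standard K-means sketch.
    Phase 1: random initialization of cluster centers.
    Phase 2: nearest-center assignment, then centroid update."""
    rng = random.Random(seed)
    # Phase 1: pick k distinct data points as initial centers.
    centers = rng.sample(points, k)
    for _ in range(iters):
        # Phase 2: assign each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[idx].append(p)
        # Recompute each center as the mean of its assigned points.
        for i, members in enumerate(clusters):
            if members:
                centers[i] = tuple(sum(x) / len(members)
                                   for x in zip(*members))
    return centers, clusters
```

The random initialization in phase 1 is exactly the shortcoming the abstract points to: different seeds can yield different final clusterings.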

Cited by 8 publications (7 citation statements)
References 5 publications
“…It is useful for discretizing continuous variables because it computes a continuous distance-based similarity measure to cluster data points [69]. It originates in signal processing and aims to partition observations into k clusters in which each observation belongs to the cluster with the nearest mean, which serves as the cluster’s prototype [70]. The discretization strategy for the input data uses the maximum and minimum dataset values, the computed cluster centers, and the midpoints between every two adjacent clusters.…”
Section: Methods
confidence: 99%
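The discretization strategy in this excerpt can be sketched as follows, assuming one-dimensional data: bin edges are formed from the dataset minimum, the midpoints between adjacent sorted cluster centers, and the dataset maximum. The function names are illustrative, not from the cited work:

```python
def midpoint_bins(values, centers):
    """Build discretization bin edges from the dataset min/max and
    the midpoints between every two adjacent cluster centers."""
    cs = sorted(centers)
    mids = [(a + b) / 2 for a, b in zip(cs, cs[1:])]
    return [min(values)] + mids + [max(values)]

def discretize(value, edges):
    """Map a continuous value to the index of the bin it falls in."""
    for i in range(len(edges) - 1):
        if value <= edges[i + 1]:
            return i
    return len(edges) - 2  # clamp values above the last edge
```

For example, with cluster centers at 1 and 9, the single midpoint 5.0 splits the range into two bins, so values near each center are mapped to that center's discrete label.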
“…A data point is assigned to the cluster whose centroid is closest to it. The algorithm then computes a new centroid for each cluster and repeats until the best centroids are found [25]. Like K-means, BIRCH is driven by a predefined number of clusters.…”
Section: Clustering Algorithm
confidence: 99%
“…Each cluster has a centroid (center), and each member object of a cluster has the minimum distance to that centroid and is farther from the centroids of the other clusters. The standard K-means algorithm uses Euclidean distance to compute the distance between each object and the centroid [50]. We propose Algorithm 3, which employs the standard K-means algorithm to support our goal in this step.…”
Section: Clustering-based K-means Algorithm
confidence: 99%
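The Euclidean-distance assignment rule described in this excerpt can be sketched in a few lines; the function name is a placeholder, not Algorithm 3 from the citing paper:

```python
import math

def nearest_centroid(obj, centroids):
    """Return the index of the centroid with the minimum Euclidean
    distance to the given object (the standard K-means assignment)."""
    distances = [math.dist(obj, c) for c in centroids]
    return distances.index(min(distances))
```

Assigning every object this way guarantees each member has minimum distance to its own centroid relative to all other centroids, which is the membership property stated above.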