k*-Means: A new generalized k-means clustering algorithm

Cheung, Yiu-ming

doi:10.1016/s0167-8655(03)00146-6

Cited by 172 publications

(65 citation statements)

References 7 publications

“…This method may not produce fine results whenever the number of clusters is unknown. An improved version of K-means called K*-means has been developed in [24]. It is unable to deal with the noisy data.…”

Section: Literature Survey (mentioning)

confidence: 99%

HK-Means: A Heuristic Approach to Initialize and Estimate the Number of Clusters in Biological Data

2016

The k-means algorithm is one of the simplest and fastest clustering algorithms and has been in use for more than four decades. One of its limitations is that the number of clusters must be specified in advance. The algorithm is also sensitive to random initialization. This paper proposes a heuristic that initializes the cluster centers and estimates the number of clusters as a discrete value. The method successfully estimates the number of clusters and initializes the cluster centers when the clusters are dense and well separated. It selects a new cluster center in each iteration: the point most dissimilar from the previously chosen points. The proposed algorithm is tested on various synthetic data sets, and the results are encouraging.
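The selection rule described in this abstract (each new center is the point most dissimilar from those already chosen) resembles the standard farthest-first traversal. The sketch below illustrates that generic technique only; it is not the HK-Means heuristic itself. The function name `farthest_point_centers`, the use of Euclidean distance as the dissimilarity, the choice of the first point as the starting center, and the fixed `k` are all assumptions made for illustration (the paper estimates the number of clusters rather than taking it as input).

```python
import math

def farthest_point_centers(points, k):
    """Pick k centers by farthest-first traversal (illustrative sketch).

    Assumption: start from points[0]; dissimilarity is Euclidean distance.
    """
    centers = [points[0]]
    # nearest[i] holds the distance from points[i] to its closest chosen center.
    nearest = [math.dist(p, centers[0]) for p in points]
    while len(centers) < k:
        # The next center is the point farthest from all centers chosen so far.
        idx = max(range(len(points)), key=lambda i: nearest[i])
        centers.append(points[idx])
        nearest = [min(d, math.dist(p, points[idx]))
                   for p, d in zip(points, nearest)]
    return centers

points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 4.9), (0.0, 9.0)]
centers = farthest_point_centers(points, 3)
print(centers)
```

Tracking only the distance to the nearest chosen center keeps each step linear in the number of points, which is why this form is a common starting point for initialization heuristics.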

“…Applied to cluster analysis, the Mean-Shift algorithm is computationally inexpensive and has a non-parametric clustering procedure which does not require prior knowledge of the number of clusters or nodes, nor does it constrain the shape of the clusters. Contrary to the k-means clustering approach [24,43,89], there are no embedded assumptions on the shape and distribution, the number of nodes or clusters. The Mean-Shift algorithm works well on static probability distributions but not as well as dynamic probability distributions such as movies [27].…”

Section: Mean-shift (mentioning)

confidence: 99%

An integrated sign language recognition system

Nel, Ghaziasgar, Connan

2013

Proceedings of the South African Institute for Computer Scientists and Information Technologists Conference
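The citation statement above contrasts Mean-Shift with k-means on the grounds that the number of clusters need not be given in advance. A minimal one-dimensional sketch of that idea follows, under stated assumptions: a Gaussian kernel, a hand-picked `bandwidth`, and a merge threshold of half the bandwidth are illustrative choices, not details taken from the cited paper, and the function name `mean_shift_modes` is hypothetical.

```python
import math

def mean_shift_modes(points, bandwidth, tol=1e-6, max_iter=100):
    """Shift each point uphill to a density mode and collect distinct modes."""
    modes = []
    for start in points:
        x = start
        for _ in range(max_iter):
            # Gaussian-kernel weights of all data points as seen from x.
            w = [math.exp(-((p - x) ** 2) / (2 * bandwidth ** 2)) for p in points]
            new_x = sum(wi * p for wi, p in zip(w, points)) / sum(w)
            converged = abs(new_x - x) < tol
            x = new_x
            if converged:
                break
        # Assumed merge rule: treat modes closer than bandwidth / 2 as the same.
        if not any(abs(m - x) < bandwidth / 2 for m in modes):
            modes.append(x)
    return modes

data = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9]
modes = mean_shift_modes(data, bandwidth=0.5)
print(len(modes), modes)
```

The number of clusters falls out as the number of distinct modes found, so the bandwidth takes over the role that k plays in k-means.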

“…Diday [24] used different representatives of the clusters (other than the cluster centers), and the Mahalanobis distance is used instead of the Euclidean distance in [61], [18] and elsewhere.…”

Section: Variants of the K-means Algorithm (mentioning)

confidence: 99%

“…17) where Q₁, Q₂ are positive definite, so that 18) and let the probabilities pₖ(xᵢ) and cluster sizes qₖ be given. If the minimizers c₁, c₂ of (4.18) do not coincide with any of the data points xᵢ, they are given by…”

Section: Centers (mentioning)

confidence: 99%

Probabilistic Distance Clustering

İyigün

2011

Wiley Encyclopedia of Operations Research and Management Science

Probabilistic distance clustering is an iterative method for probabilistic clustering of data. Given clusters, their centers, and the distances of data points from these centers, the probability of cluster membership at any point is assumed to be inversely proportional to the distance from (the center of) the cluster in question. This assumption is the working principle. The method is a generalization, to several centers, of the Weiszfeld method for solving the Fermat–Weber location problem. At each iteration, the distances (Euclidean, Mahalanobis, etc.) from the cluster centers are computed for all data points, and the centers are updated as convex combinations of these points, with weights determined by the above principle. Computations stop when the centers stop moving. Progress is monitored by the joint distance function, a measure of distance from all cluster centers, which evolves during the iterations and captures the data in its low contours. The method is simple, fast (requiring a small number of cheap iterations), and insensitive to outliers.
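The iteration described in this abstract can be sketched as follows. The abstract states only that membership probabilities are inversely proportional to distance and that centers are convex combinations of the data points. The specific weight used here, p² / d, is an assumption based on the usual Weiszfeld-style formulation and is not given in the text above. The function name `pd_cluster`, the Euclidean distance, the tolerance, and the small `eps` guard are likewise illustrative.

```python
import math

def pd_cluster(points, centers, tol=1e-6, max_iter=200, eps=1e-12):
    """Probabilistic distance clustering sketch with Euclidean distances."""
    centers = [tuple(c) for c in centers]
    dim = len(points[0])
    for _ in range(max_iter):
        sums = [[0.0] * dim for _ in centers]
        totals = [0.0] * len(centers)
        for x in points:
            # Distance to every center; eps avoids division by zero.
            d = [math.dist(x, c) + eps for c in centers]
            inv = [1.0 / dk for dk in d]
            s = sum(inv)
            p = [v / s for v in inv]  # membership: p_k proportional to 1 / d_k
            for k in range(len(centers)):
                w = p[k] ** 2 / d[k]  # assumed weight, not stated in the abstract
                totals[k] += w
                for j in range(dim):
                    sums[k][j] += w * x[j]
        # Each new center is a convex combination of the data points.
        new_centers = [tuple(sums[k][j] / totals[k] for j in range(dim))
                       for k in range(len(centers))]
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:  # stop when the centers stop moving
            break
    return centers

data = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
pd_centers = pd_cluster(data, [(2.0, 2.0), (8.0, 8.0)])
print(pd_centers)
```

The initial centers are deliberately placed away from the data points: a center that coincides with a point has zero distance to it, which is the degenerate case the Weiszfeld method is known to handle poorly.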
