2015
DOI: 10.5120/ijca2015905919

A Comprehensive Survey on Centroid Selection Strategies for Distributed K-means Clustering Algorithm

Abstract: Extremely large data sets, often known as "Big Data", are analyzed for interesting patterns, trends, and associations, especially those relating to human behavior and interactions. Extracting meaningful and useful information from them needs to be done in parallel using advanced clustering algorithms. In this paper, an effort has been made to modify the existing K-means algorithm so that it works in parallel using the MapReduce paradigm. K-means, due to its gradient descent nature, is highly sensitive to the initial p…
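To make the abstract's idea of K-means "in parallel using the MapReduce paradigm" concrete, here is a minimal single-process sketch. It is not the paper's implementation: the function names (map_phase, reduce_phase, kmeans) are hypothetical, the map and reduce steps run in ordinary Python rather than on Hadoop, and the starting centroids are simply passed in by the caller instead of being chosen by any of the strategies the paper surveys.

```python
from collections import defaultdict


def map_phase(points, centroids):
    """Map step (illustrative): emit (index of nearest centroid, point)."""
    for p in points:
        nearest = min(
            range(len(centroids)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
        )
        yield nearest, p


def reduce_phase(pairs, centroids):
    """Reduce step (illustrative): average the points grouped under each key."""
    groups = defaultdict(list)
    for idx, p in pairs:
        groups[idx].append(p)
    updated = list(centroids)  # a centroid with no points keeps its old position
    for idx, members in groups.items():
        updated[idx] = tuple(sum(coord) / len(members) for coord in zip(*members))
    return updated


def kmeans(points, centroids, max_iter=100):
    """Repeat map + reduce until the centroids stop changing."""
    centroids = list(centroids)
    for _ in range(max_iter):
        updated = reduce_phase(map_phase(points, centroids), centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
result = kmeans(points, [(1.0, 1.0), (9.0, 1.0)])
```

In a real MapReduce job the map step would run on separate data splits and the framework would group the emitted pairs by centroid index before the reduce step; here a dictionary stands in for that shuffle. Because the loop only refines the centroids it is given, the outcome depends on the starting centroids supplied by the caller.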

Citations: cited by 5 publications (4 citation statements)
References: 18 publications
“…Poonam Ghuli et. al [12] has done a comprehensive survey on centroid selection strategies for distributed k-means clustering algorithm. The execution is divided into four modules.…”
Section: MapReduce Paradigm (mentioning)
confidence: 99%
“…The experimental results of Jin Zhou on the Hadoop distributed platform show that the SPAB-DKMC algorithm can reduce the number of iterations and improve the efficiency of the distributed K-means clustering algorithm. [15][16][17][18][19][20][21][22][23][24][25][26] In foreign countries, Poonam Ghuli has proposed an improved density-based distributed clustering (DBDC) algorithm. The algorithm uses the data grid mapping method to map the data objects to the local spatial grid first, which improves the efficiency of local clustering.…”
Section: Introduction (mentioning)
confidence: 99%