2016 International Conference on Advances in Computing, Communications and Informatics (ICACCI) 2016
DOI: 10.1109/icacci.2016.7732323

A novel K-means based clustering algorithm for big data

Citations: cited by 15 publications (7 citation statements)
References: 10 publications
“…A parallel implementation of k means algorithm over spark is proposed in [36] large scale text and UCI datasets. In another paper, the authors addressed the issue of predetermining the number of input clusters which is a present problem in most K-means methods by automating the number of input clusters which resulted in better clustering quality when processing large scale data [37].…”
Section: A4 Scalable Methods
Citation type: mentioning
confidence: 99%
“…In addition, authors proposed a cluster pruning concept to augment K-Means algorithm to reduce clusters to reduce search space for further computation. A similar effort was made by Sinha and Jana [17] who focused on performing automated cluster formation to cope up with Big Data analytics problems. Considering significance of distance metric in K-Means clustering, Niu [9] applied block function which collects instances as blocks to cluster attributes.…”
Section: Introduction
Citation type: mentioning
confidence: 98%
“…have been developed. However, most of these algorithms employ fixed stopping criteria that forces algorithm to undergo huge computational overheads and time consumption [17], [18]. Here the Author [31] said that Hadoop solves the main problem of processing and storage.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…K-DBSCAN: An improved DBSCAN algorithm for big data In [28], presented in 2016, the dataset was divided into smaller parts and distributed to several nodes in a cluster of machines. Apache Hadoop was used as a scalable, powerful platform for this purpose.…”
Citation type: mentioning
confidence: 99%