2017
DOI: 10.1002/cpe.4109

A parallel k‐means clustering algorithm based on redundance elimination and extreme points optimization employing MapReduce

Abstract: When facing massive statistical data, the k‐means algorithm is very difficult to satisfy the need of data processing as it lacks an effective parallel mechanism. This paper proposes an improved k‐means algorithm (IMR‐KCA) to conduct clustering analysis based on medical data employing MapReduce computing framework. Through analyzing the defects of vast redundancy in the traditional k‐means algorithms, a selection model is firstly proposed to simplify the computations with multiple clustering centers. Ba…

Citations: cited by 22 publications (18 citation statements)
References: 40 publications (39 reference statements)
“…The clustering of sensor nodes is usually adopted in large-scale networks. Cluster-based networks provide more reliability, better coverage, greater fault tolerance, and better task allocation and energy-efficiency [13][14][15][16][17]. Several cluster-based routing protocols for LLNs/WSNs have been well-studied and proposed in the last decade in attempts to resolve the "energy-hole" problem [12].…”
Section: Cluster-based Routing Protocols
Type: mentioning
confidence: 99%
“…In order to benefit from the high performance of multiprocessor computer systems, many efforts have been made to develop and implement parallel pattern analysis algorithms [1][2][3][4][5][6][7][8][9][10][11]. Improvement for the k-means algorithm (IMR-KCA) proposed in [1]. IMR-KCA provides a selection model to simplify the calculations with multiple clustering centers by analyzing the flaws of vast redundancy in traditional k-means algorithms.…”
Section: Related Research
Type: mentioning
confidence: 99%
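
For orientation, one iteration of k-means can be written in a MapReduce style roughly as follows. This is a minimal generic sketch, not the IMR-KCA method: the selection model and extreme points optimization named in the title are not detailed on this page, so they are not reproduced. The names `map_phase`, `reduce_phase`, and `mapreduce_step` are hypothetical, and the "splits" are plain Python lists standing in for distributed input partitions.

```python
import math
from collections import defaultdict

def map_phase(split, centers):
    """Mapper: for each point in a split, emit (nearest-center index, (point, 1))."""
    for p in split:
        j = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        yield j, (p, 1)

def reduce_phase(pairs):
    """Reducer: sum coordinates and counts per center index, then average."""
    sums, counts = {}, defaultdict(int)
    for j, (p, n) in pairs:
        sums[j] = tuple(a + b for a, b in zip(sums[j], p)) if j in sums else tuple(p)
        counts[j] += n
    return {j: tuple(s / counts[j] for s in sums[j]) for j in sums}

def mapreduce_step(splits, centers):
    """Run one k-means iteration over all splits; empty clusters keep their old center."""
    pairs = [kv for split in splits for kv in map_phase(split, centers)]
    updated = reduce_phase(pairs)
    return [updated.get(j, centers[j]) for j in range(len(centers))]
```

Calling `mapreduce_step` repeatedly until the centers stop changing gives the usual iterative procedure; in a real cluster the mapper output would be shuffled by key before reaching the reducers.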
“…K‐means clustering is a well‐known technique for performing non‐hierarchical clustering. In K‐means methods, clusters are groups of data characterized by a small distance to the cluster center. An objective function, typically the sum of the distance to a set of putative cluster centers, is optimized until the best cluster center candidates are found.…”
Section: Related Work
Type: mentioning
confidence: 99%
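
The objective described in the statement above can be illustrated with a small sketch. It assumes Euclidean distance and the common squared-distance form of the objective (the quoted text says only "sum of the distance"), and it uses Lloyd's iteration as the optimizer. The function names `nearest`, `objective`, and `kmeans` are hypothetical and do not come from any cited paper.

```python
import math
import random

def nearest(point, centers):
    """Return the index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))

def objective(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(math.dist(p, centers[nearest(p, centers)]) ** 2 for p in points)

def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's iteration: alternate assignment and center update until stable."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumption: initial centers drawn from the data
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Each new center is the mean of its group; an empty group keeps its center.
        new_centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[j]
            for j, g in enumerate(groups)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    return centers
```

Each assignment and update step cannot increase the objective, which is why the loop settles on a fixed set of centers; the result is a local optimum and may depend on the initial sample.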