2015
DOI: 10.1002/cpe.3580

A MapReduce‐based parallel K‐means clustering for large‐scale CIM data verification

Abstract: The Common Information Model (CIM) has been heavily used in electric power grids for data exchange among a number of auxiliary systems such as communication systems, monitoring systems and marketing systems. With the rapid deployment of digitalized devices in electric power networks, the volume of data grows continuously, which makes verification of CIM data a challenging issue. This paper presents a parallel K-means for large scale CIM data verification based on the MapReduce computing model which has been wide…

Cited by 10 publications (14 citation statements)
References 22 publications (29 reference statements)
“…K‐means clustering is a well‐known technique for performing non‐hierarchical clustering. In K‐means methods, clusters are groups of data characterized by a small distance to the cluster center. An objective function, typically the sum of the distance to a set of putative cluster centers, is optimized until the best cluster center candidates are found.…”
Section: Related Work (mentioning)
confidence: 99%
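
The optimization described in this statement can be sketched in a few lines. This is a minimal illustration only, not the cited paper's implementation: it assumes the common choice of summed squared Euclidean distance as the objective, and the names `objective` and `update_centers` are hypothetical.

```python
def sq_dist(p, c):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, c))

def nearest(p, centers):
    """Index of the putative center closest to point p."""
    return min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))

def objective(points, centers):
    """Assumed objective: sum of squared distances to the nearest center."""
    return sum(sq_dist(p, centers[nearest(p, centers)]) for p in points)

def update_centers(points, centers):
    """One refinement step: assign points, then move each center to its cluster mean."""
    clusters = [[] for _ in centers]
    for p in points:
        clusters[nearest(p, centers)].append(p)
    new_centers = []
    for old, members in zip(centers, clusters):
        if not members:
            # An empty cluster keeps its previous center.
            new_centers.append(old)
            continue
        new_centers.append(
            tuple(sum(m[d] for m in members) / len(members) for d in range(len(old)))
        )
    return new_centers
```

Repeating `update_centers` until `objective` stops decreasing gives the usual iterative refinement; each step cannot increase this squared-distance objective.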
“…To measure the fitness of a scheduler (chromosome), the fitness function is defined using mean square error (MSE): $f(T) = \sum_{i=1}^{k} (\bar{T} - T_i)^2$, $\bar{T} = \frac{\sum_{i=1}^{k} T_i}{k}$, where $T_i$ represents the processing time for the $i$th mapper, and $\bar{T}$ represents the average processing time of the number of $k$ mappers. In our design, a single‐point crossover is employed.…”
Section: Algorithm Design (mentioning)
confidence: 99%
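
As a rough illustration of how such a fitness value could be computed, the sketch below reads the fitness as the sum of squared deviations of the mapper processing times from their mean. That reading is an assumption: a strict MSE would also divide by k, which rescales the value without changing how schedules with the same number of mappers rank. The function name `fitness` is hypothetical.

```python
def fitness(times):
    """Sum of squared deviations of mapper times from their mean.

    `times` holds the processing time of each of the k mappers.
    Lower values indicate a more balanced schedule.
    """
    k = len(times)
    mean_time = sum(times) / k
    return sum((mean_time - t) ** 2 for t in times)
```

Under this reading, a schedule in which every mapper takes the same time scores 0, and a genetic algorithm would prefer chromosomes with smaller values.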
“…Based on Eqs. (11)-(30), the relationship between data chunks $D_m$ and the overall processing time $T$ is established. Therefore, the time $T_a$ of the cluster to process data in one processing wave is the maximal one in Eq.…”
Section: Modeling Of Data Processing In Hadoop (mentioning)
confidence: 99%
“…Ensuring the integrity of transmitted data is of paramount importance for data analysis, the paper ‘A MapReduce based Parallel K‐Means Clustering for Large Scale CIM Data Verification’ discusses the topic and presents a parallel K‐means clustering algorithm for large scale Common Information Model (CIM) data verification. The paper concludes that time saving is achievable using parallel K‐means while generating a high level of precision in data verification.…”
Section: Introduction (mentioning)
confidence: 99%