Power-Efficient Hardware Architecture of K-Means Clustering With Bayesian-Information-Criterion Processor for Multimedia Processing Applications

Chen, Shen‐Ming; Sun, Chih-Hao; Su, Hsiao-Hang; Chien, Shao-Yi; Deguchi, Daisuke; Ide, Ichiro; Murase, Hiroshi

doi:10.1109/jetcas.2011.2165231

Cited by 25 publications

(19 citation statements)

References 16 publications


“…Besides considering the general setting of distributed environments, K-means has been investigated in the context of specific hardware, e.g., K-means for Graphic Processing Units [11] or supercomputers [4]. Other approaches even design hardware architectures [5] or combined hardware and software architectures [3] for the sole purpose of efficient K-means clustering. Orthogonal to these approaches for very specific architectures we consider the question how to scale up K-means clustering on a current workstation.…”

Section: Related Work and Discussion (mentioning)

confidence: 99%

Multi-core K-means

Böhm

Perdacher

Plant

2017

Proceedings of the 2017 SIAM International Conference on Data Mining


Today's microprocessors consist of multiple cores each of which can perform multiple additions, multiplications, or other operations simultaneously in one clock cycle. To maximize performance, two types of parallelism must be applied in a data mining algorithm: MIMD (Multiple Instruction Multiple Data) where different CPU cores execute different code and follow different threads of control, and SIMD (Single Instruction Multiple Data) where within a core, the same operation is executed at once on various data. It is commonly agreed among data mining practitioners and researchers that dis-proportionally few works consider the performance potential of today's popular micro-architectures. In this paper, we consider the wide-spread clustering algorithm K-means as a highly relevant use-case for knowledge discovery on big data. We propose Multi-core K-Means (MKM), a completely re-engineered clustering algorithm which applies MIMD and SIMD parallelism. MKM uses a sophisticated strategy for the access of data vectors and cluster representatives to minimize data transfer between main memory, cache, and registers. For SIMD parallelism it is also essential to avoid branching operations like if-then: we propose to code cluster IDs and distances in joint variables to perform the argmin operation SIMD-parallel and without any branching. Our experiments demonstrate a speed-up which is almost linear in the number of cores. On a pair of shared-memory quad-core processors, MKM is between 95 and 140 times faster than non-parallel K-means, 4-6 times faster than auto-vectorized fully parallel standard K-means, and 2.1 times faster than K-means based on BLAS.
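The branch-free argmin mentioned in this abstract can be illustrated with a small sketch. The abstract does not give the actual encoding, so everything below is an assumption: the bit pattern of a non-negative float32 squared distance is placed in the high bits of a 32-bit word and the cluster ID in the low `ID_BITS` bits, so that a plain minimum over the packed words returns the nearest cluster without any if-then. The names `packed_argmin`, `ID_BITS`, and `ID_MASK` are hypothetical, and NumPy stands in for the SIMD instructions.

```python
import numpy as np

ID_BITS = 8  # assumption: at most 256 clusters
ID_MASK = np.uint32((1 << ID_BITS) - 1)


def packed_argmin(points, centroids):
    """Return the nearest-centroid index per point without branching.

    Non-negative IEEE-754 floats keep their order when their bits are
    read as unsigned integers, so min() over the packed words picks the
    smallest distance; the low bits then carry the matching cluster ID.
    """
    points = np.asarray(points, dtype=np.float32)
    centroids = np.asarray(centroids, dtype=np.float32)

    # Squared Euclidean distances, shape (n_points, n_clusters).
    diff = points[:, None, :] - centroids[None, :, :]
    d2 = np.einsum("nkd,nkd->nk", diff, diff).astype(np.float32)

    # Reinterpret the distance bits and overwrite the low mantissa bits
    # with the cluster ID (this costs a little distance precision).
    bits = np.ascontiguousarray(d2).view(np.uint32)
    ids = np.arange(centroids.shape[0], dtype=np.uint32)
    packed = (bits & ~ID_MASK) | ids

    best = packed.min(axis=1)  # one min per point, no comparison branches
    return (best & ID_MASK).astype(np.int64)
```

Because the low mantissa bits are overwritten, two distances that differ only in those bits are treated as a tie and resolved toward the smaller cluster ID; a real implementation would need to choose `ID_BITS` with that trade-off in mind.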


“…The core area and power consumption of the proposed engine are 0.36 mm² and 9.21 mW at 100-MHz frequency for VDD = 1.2 V. The engine consumes 62% less power with a comparable area consumption w.r.t. state-of-the-art architecture for ASIC implementation in [8] (the power reported is from back-end simulation using SoC Encounter). A comparison of the area requirement and power consumption of the proposed engine with state-of-the-art architectures have been highlighted in Table I.…”

Section: E. Comparison With Other Architectures (mentioning)

confidence: 99%

Coordinate Rotation-Based Low Complexity $K$-Means Clustering Architecture

Adapa

Biswas

Bhardwaj

et al. 2017

IEEE Trans. VLSI Syst.


Abstract: In this paper, we propose a low-complexity architectural implementation of the K-Means based clustering algorithm used widely in mobile health monitoring applications for unsupervised and supervised learning. The iterative nature of the algorithm, computing the distance of each data point from a respective centroid for a successful cluster formation until convergence presents a significant challenge to map it onto a low-power architecture. This has been addressed by the use of a 2-D Coordinate Rotation Digital Computer (CORDIC) based low-complexity engine for computing the n-dimensional Euclidean distance involved during clustering. The proposed clustering engine was synthesized using the TSMC 130 nm technology library and a place and route was performed following which the core area and power were estimated as 0.36 mm² and 9.21 mW @ 100 MHz respectively making the design applicable for low-power real-time operations within a sensor node.

Index Terms: K-Means, CORDIC, signal processing, hardware design, low complex architecture.
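As a rough illustration of how a 2-D CORDIC unit can yield an n-dimensional Euclidean distance, the sketch below runs CORDIC in vectoring mode to obtain sqrt(x² + y²) and then folds one coordinate difference at a time into a running magnitude. The chaining scheme and the names `cordic_magnitude`, `cordic_distance`, `ITERATIONS`, and `GAIN` are assumptions made for this example, not the architecture of the cited paper; floating-point multiplications by 2^-i stand in for the hardware shifts.

```python
import math

ITERATIONS = 16  # assumption: number of shift-add micro-rotations

# Each micro-rotation stretches the vector by sqrt(1 + 2^-2i);
# the product of these factors is the constant CORDIC gain (about 1.6468).
GAIN = math.prod(math.sqrt(1.0 + 2.0 ** (-2 * i)) for i in range(ITERATIONS))


def cordic_magnitude(x, y):
    """Approximate sqrt(x*x + y*y) with CORDIC in vectoring mode."""
    x, y = abs(x), abs(y)
    for i in range(ITERATIONS):
        shift = 2.0 ** -i  # a right shift by i bits in fixed-point hardware
        if y > 0:
            # Rotate clockwise to push y toward zero.
            x, y = x + y * shift, y - x * shift
        else:
            # Rotate counter-clockwise.
            x, y = x - y * shift, y + x * shift
    # x now holds GAIN * magnitude; undo the constant gain.
    return x / GAIN


def cordic_distance(p, q):
    """n-D Euclidean distance by chaining the 2-D magnitude unit."""
    mag = 0.0
    for a, b in zip(p, q):
        mag = cordic_magnitude(mag, a - b)
    return mag
```

For example, `cordic_distance((1, 2, 3), (4, 6, 3))` should come out close to 5, since the coordinate differences are 3, 4, and 0. In hardware the gain correction is typically a single constant scaling rather than a division.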


“…The basic idea of the K-means algorithm [14][15][16]: Randomly select K objects, each serving as an initial cluster center. Assign each remaining object to the nearest cluster, according to its distance from the center of each cluster.…”

Section: K-means Clustering Algorithm (mentioning)

confidence: 99%
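The two steps described in the K-means statement above (random selection of K objects as centers, then assignment of every object to its nearest center) can be sketched as follows. This is a minimal illustration limited to what the excerpt states; the function names `squared_distance` and `init_and_assign` and the `seed` parameter are hypothetical, and the later steps of the algorithm, which the truncated excerpt does not show, are left out.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_and_assign(objects, k, seed=0):
    """Pick k objects at random as centers, then label every object.

    Returns (centers, labels), where labels[i] is the index of the
    center nearest to objects[i].
    """
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    center_idx = rng.sample(range(len(objects)), k)
    centers = [objects[i] for i in center_idx]

    labels = [
        min(range(k), key=lambda c: squared_distance(obj, centers[c]))
        for obj in objects
    ]
    return centers, labels
```

Comparing squared distances gives the same nearest center as comparing true distances, so the square root can be skipped in the assignment step.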