2021
DOI: 10.1002/cpe.6621

Parallel and accurate k‐means algorithm on CPU‐GPU architectures for spectral clustering

Abstract: k-Means is a standard algorithm for clustering data. It constitutes generally the final step in a more complex chain of high-quality spectral clustering. However, this chain suffers from lack of scalability when addressing large datasets. This can be overcome by applying also the k-means algorithm as a preprocessing task to reduce the input data instances. We propose parallel optimization techniques for the k-means algorithm on CPU and GPU. Particularly we use a two-step summation method with package processin…
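The chain the abstract describes (k-means as a preprocessing step to reduce the input, then spectral clustering, then k-means again as the final step) can be illustrated with a small serial sketch. This is not the paper's parallel CPU-GPU implementation. The function names, the Gaussian affinity with bandwidth `sigma`, the normalized Laplacian, and the farthest-first initialization are all assumptions chosen to keep the example short and deterministic on well-separated data.

```python
import numpy as np

def _assign(X, C):
    # Index of the nearest centroid (squared Euclidean distance) for each row of X.
    return ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)

def farthest_first_init(X, k, rng):
    # Assumed initialization: start from a random point, then repeatedly
    # pick the point farthest from all centroids chosen so far.
    idx = [int(rng.integers(len(X)))]
    d2 = ((X - X[idx[0]]) ** 2).sum(axis=1)
    for _ in range(k - 1):
        nxt = int(d2.argmax())
        idx.append(nxt)
        d2 = np.minimum(d2, ((X - X[nxt]) ** 2).sum(axis=1))
    return X[idx].copy()

def kmeans(X, k, n_iter=50, seed=0):
    # Plain Lloyd iterations; returns (centroids, labels).
    rng = np.random.default_rng(seed)
    C = farthest_first_init(X, k, rng)
    for _ in range(n_iter):
        labels = _assign(X, C)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # an emptied cluster keeps its old centroid
                C[j] = members.mean(axis=0)
    return C, _assign(X, C)

def spectral_embed(P, n_clusters, sigma=1.0):
    # Gaussian affinity between representatives (an assumed choice).
    d2 = ((P[:, None, :] - P[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalized Laplacian: I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(W.sum(axis=1), 1e-12))
    L_sym = np.eye(len(P)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; keep the smallest ones.
    _, vecs = np.linalg.eigh(L_sym)
    U = vecs[:, :n_clusters].copy()
    U /= np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)
    return U

def reduced_spectral_clustering(X, n_representatives, n_clusters, sigma=1.0, seed=0):
    # Step 1: k-means as preprocessing, shrinking n points to a few representatives.
    reps, point_to_rep = kmeans(X, n_representatives, seed=seed)
    # Step 2: spectral embedding of the representatives only.
    U = spectral_embed(reps, n_clusters, sigma)
    # Step 3: k-means as the final step, run in the embedded space.
    _, rep_labels = kmeans(U, n_clusters, seed=seed)
    # Each original point inherits the label of its representative.
    return rep_labels[point_to_rep]
```

The point of the reduction is that the affinity matrix and the eigendecomposition are computed over `n_representatives` rows rather than over all input points, which is where the full chain stops scaling.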

Cited by 6 publications (4 citation statements)
References: 19 publications
“…To meet the requirements for the simultaneous and accurate organization of very large-scale datasets, we have presented a three-step technique. By repeatedly running K-means with a customizable number of clusters, we use an uncommon strategy to select proxied samples in the first step [19]. To refine the selected cases based on their outlier scores, an outlier search method is also applied.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…k ‐Means is a standard algorithm for clustering data used as the final step for high‐quality spectral clustering. To overcome the scalability challenge when processing large datasets, the authors of paper 2 propose to apply also the k ‐means algorithm as a preprocessing task to reduce the input data instances. Additionally, parallel optimization techniques are introduced to improve the efficiency of the k ‐means algorithm on CPU and GPU.…”
Section: Accepted Papers for the Special Issue: Summary (citation type: mentioning)
confidence: 99%
“…Although many papers have been published on accelerating k-means using GPUs, almost all of them are based on the standard k-means algorithm by applying different optimization techniques on various steps and require the whole dataset to be loaded into the GPU's global memory (Kruliš and Kratochvíl 2020;He et al 2022;Taylor and Gowanlock 2021). To the best of our knowledge, massively parallel processing of the k-means clustering algorithm accelerated with the triangle inequality has not been reported in the literature, especially an algorithm that is capable of handling datasets larger than the global memory of the GPU.…”
Section: GPU-based k-means Implementations (citation type: mentioning)
confidence: 99%