2019
DOI: 10.4018/ijdwm.2019100101

Hybrid Partitioning-Density Algorithm for K-Means Clustering of Distributed Data Utilizing OPTICS

Abstract: The authors present the first clustering algorithm for use with distributed data that is fast, reliable, and does not make any presumptions in terms of data distribution. The authors' algorithm constructs a global clustering model using small local models received from local clustering statistics. This approach outperforms the classical non-distributed approaches since it does not require downloading all of the data to the central processing unit. The authors' solution is a hybrid algorithm that uses the best …
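The idea of building a global model from small local models, rather than shipping all raw data to one machine, can be illustrated with a generic sketch. This is not the authors' hybrid algorithm (the abstract is truncated and the OPTICS-based part is not reproduced here). It assumes each site runs plain k-means locally and sends only (centroid, count) pairs, which a coordinator then merges with weighted k-means. All function names, the first-k seeding rule, and the sample data are hypothetical.

```python
from math import dist


def weighted_kmeans(points, weights, k, iters=20):
    """Lloyd's algorithm on weighted points.

    Assumption for this sketch: the first k points are used as seeds.
    Returns the final centers and the total weight assigned to each
    center during the last assignment pass.
    """
    dims = len(points[0])
    centers = [tuple(p) for p in points[:k]]
    totals = [0.0] * k
    for _ in range(iters):
        sums = [[0.0] * dims for _ in range(k)]
        totals = [0.0] * k
        for p, w in zip(points, weights):
            # Assign the point to its nearest current center.
            j = min(range(k), key=lambda c: dist(p, centers[c]))
            totals[j] += w
            for d, x in enumerate(p):
                sums[j][d] += w * x
        # Move each center to the weighted mean of its points;
        # an empty cluster keeps its previous center.
        centers = [
            tuple(s / totals[j] for s in sums[j]) if totals[j] > 0 else centers[j]
            for j in range(k)
        ]
    return centers, totals


def local_summary(points, k):
    """Run k-means at one site and return (centroid, count) pairs only."""
    centers, counts = weighted_kmeans(points, [1.0] * len(points), k)
    return [(c, n) for c, n in zip(centers, counts) if n > 0]


def global_model(summaries, k):
    """Merge the local summaries of all sites into k global centroids."""
    merged = [pair for site in summaries for pair in site]
    centroids = [c for c, _ in merged]
    counts = [n for _, n in merged]
    centers, _ = weighted_kmeans(centroids, counts, k)
    return centers


# Hypothetical data held at two separate sites.
site_a = [(0.0, 0.0), (0.2, 0.0), (10.0, 10.0), (10.2, 10.0)]
site_b = [(0.1, 0.1), (9.9, 10.1), (10.1, 9.9)]

# Only the small summaries travel to the coordinator, never the raw points.
summaries = [local_summary(site_a, 2), local_summary(site_b, 2)]
print(global_model(summaries, 2))
```

Weighting each local centroid by its point count means the merged centroid of a group equals the mean of the underlying raw points whenever the local clusters do not straddle a global cluster boundary. When they do, the result is only an approximation.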

Citations: cited by 2 publications (1 citation statement)
References: 22 publications
“…K‐means is also well supported in the Mlib machine learning library of Apache Spark 18 . For example, DBSCAN, 19 OPTICS, 20,21 and other algorithms perform clustering based on the dense density of datasets in spatial distribution, wherein the number of clusters need not be set in advance; thus, they are particularly suitable for clustering datasets with unknown content. In the context of big data, the optimization and innovation of these algorithms are still very important research prospects 18…”
Section: Related Work
Citation type: mentioning
confidence: 99%