2023
DOI: 10.32604/iasc.2023.027579

P-ROCK: A Sustainable Clustering Algorithm for Large Categorical Datasets

Abstract: Data clustering is crucial when it comes to data processing and analytics. The new clustering method overcomes the challenge of evaluating and extracting data from big data. Numerical or categorical data can be grouped. Existing clustering methods favor numerical data clustering and ignore categorical data clustering. Until recently, the only way to cluster categorical data was to convert it to a numeric representation and then cluster it using current numeric clustering methods. However, these algorithms coul…

Cited by 4 publications (4 citation statements)
References 27 publications
“…The detailed analysis and classification are sequentially presented in Section 3.2. Specifically, there are five studies related to hierarchical clustering, with four of them based on rough-set theory (MTMDP [38], MGR [41], MNIG [47], and HPCCD [96]), except the P-ROCK [106]. Two studies focus on agglomerative hierarchical clustering, while the remaining three focus on divisive hierarchical clustering.…”
Section: Discussion (mentioning)
confidence: 99%
“…Many algorithms performed scalability testing, such as those mentioned in references [40,41,56,63,64,67,68,70,96,98,106], aiming to improve clustering methods for high-dimensional data. Notably, algorithms like SCC [61] and SKSCC [76] utilize probabilistic distance functions based on kernel density estimation to increase clustering performance.…”
Section: Discussion (mentioning)
confidence: 99%
“…ROCK is a hierarchical clustering algorithm that constructs a hierarchy of clusters in the dataset through a bottom-up approach [51, 85, 86, 87]. Initially, it treats every object as a solo cluster and then merges the more similar clusters to form a new cluster.…”
Section: Rock Algorithm (mentioning)
confidence: 99%
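
The bottom-up procedure described in the last statement can be illustrated with a minimal Python sketch. It is not the ROCK or P-ROCK implementation: the excerpt does not give their merge criterion, so this sketch assumes average pairwise Jaccard similarity between records as the measure of cluster similarity. The function names (`jaccard`, `cluster_similarity`, `agglomerate`) and the toy records are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of categorical values."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_similarity(c1, c2, records):
    """Average pairwise Jaccard similarity between two clusters.

    Assumption: this stands in for the merge criterion, which the
    excerpt does not specify.
    """
    total = sum(jaccard(records[i], records[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(records, k):
    """Merge the most similar pair of clusters until k clusters remain."""
    # Every object starts as its own (solo) cluster.
    clusters = [[i] for i in range(len(records))]
    while len(clusters) > max(k, 1):
        # Pick the pair of clusters with the highest similarity.
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_similarity(
                clusters[p[0]], clusters[p[1]], records
            ),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (a, b)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


# Toy categorical records (hypothetical data), each a set of attribute values.
records = [
    {"red", "round", "small"},
    {"red", "round", "large"},
    {"blue", "square", "small"},
    {"blue", "square", "large"},
]
result = agglomerate(records, 2)
print(sorted(result))
```

On this toy input the two red, round records end up in one cluster and the two blue, square records in the other. The exhaustive pair search costs a full scan over all cluster pairs at every merge, which is acceptable for a demonstration but would not scale to the large datasets the paper targets.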