2021
DOI: 10.48550/arxiv.2108.10262
Preprint

Cube Sampled K-Prototype Clustering for Featured Data

Abstract: Clustering large amounts of data is becoming increasingly important. Because the data sets are so large, clustering algorithms often take too much time. Sampling the data before clustering is commonly used to reduce this time. In this work, we propose a probabilistic sampling technique called cube sampling along with K-Prototype clustering. Cube sampling is used because of its accurate sample selection. K-Prototype is the most frequently used clustering algorithm when the data is numerical as well …
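The sample-then-cluster idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: simple random sampling stands in for cube sampling (which is not reproduced here), only the assignment step of K-Prototype is shown, and the dissimilarity is the commonly used mixed measure (squared Euclidean distance on numerical attributes plus a weight `gamma` times the number of categorical mismatches). The function names, the `gamma` default, and the toy records are all hypothetical.

```python
import random


def k_prototype_distance(x_num, x_cat, p_num, p_cat, gamma=1.0):
    """Mixed dissimilarity between a record and a prototype.

    Numerical part: squared Euclidean distance.
    Categorical part: count of mismatching attributes, weighted by gamma.
    """
    numeric_part = sum((a - b) ** 2 for a, b in zip(x_num, p_num))
    categorical_part = sum(1 for a, b in zip(x_cat, p_cat) if a != b)
    return numeric_part + gamma * categorical_part


def assign_sample(records, prototypes, sample_size, gamma=1.0, seed=0):
    """Draw a sample, then assign each sampled record to its nearest prototype.

    ASSUMPTION: simple random sampling without replacement is used here as a
    placeholder for cube sampling.
    """
    rng = random.Random(seed)
    sampled_indices = sorted(rng.sample(range(len(records)), sample_size))
    labels = {}
    for i in sampled_indices:
        x_num, x_cat = records[i]
        distances = [
            k_prototype_distance(x_num, x_cat, p_num, p_cat, gamma)
            for p_num, p_cat in prototypes
        ]
        labels[i] = distances.index(min(distances))
    return labels


# Toy mixed-type data: (numerical attributes, categorical attributes).
records = [
    ((1.0, 2.0), ("red", "small")),
    ((1.2, 1.9), ("red", "small")),
    ((8.0, 9.0), ("blue", "large")),
    ((7.9, 9.2), ("blue", "small")),
]
prototypes = [
    ((1.0, 2.0), ("red", "small")),
    ((8.0, 9.0), ("blue", "large")),
]
labels = assign_sample(records, prototypes, sample_size=3)
```

Clustering only the sampled records reduces the number of distance evaluations per iteration from the full data size to the sample size, which is the time saving the abstract refers to.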

Cited by 1 publication (1 citation statement)
References 8 publications
“…Prior to using unequal probability sampling, an inclusion probability must be determined. There have been efforts to assess the probability of inclusion for large datasets, such as Jain et al [13] and Nigam et al [16]. This is not a feasible strategy for dealing with multi-dimensional data.…”
Section: Related Work (mentioning)
confidence: 99%