2022
DOI: 10.3390/app12052405

An Implementation of the HDBSCAN* Clustering Algorithm

Abstract: This work presents Tribuo Hdbscan, an implementation of the HDBSCAN* clustering algorithm developed as a new feature of the Java machine learning library Tribuo. The implementation leverages concurrency and achieves better performance than the reference Java implementation. Tribuo Hdbscan also provides prediction functionality, a novel technique for making fast predictions for unseen data points using an HDBSCAN* clustering model. Tribuo Hdbscan cluster results and performance …

Cited by 34 publications (22 citation statements)
References: 13 publications
“…Otherwise, n remains unlabelled. After this process, we cluster the remaining nodes using the HDBSCAN [14] algorithm and add the results to…”
Section: Nei N C Ten N C
Citation type: mentioning
confidence: 99%
“…It does not require each data point to be assigned to a cluster, as it recognises dense clusters. Outliers or noise are points belonging to no cluster group (Stewart & Al-Khassaweneh, 2022).…”
Section: Detection Of Shc Clusters
Citation type: mentioning
confidence: 99%
“…From the tuned model (mtry = 15), we extracted the most important morpho-colorimetric variables and the confusion matrix. Finally, we applied HDBSCAN* (Hierarchical Density-Based Spatial Clustering of Applications with Noise) [81,82], an unsupervised clustering algorithm. Classical clustering techniques such as K-mean are limited by the fact that (1) the number of clusters must be known a priori, (2) each point, even outliers, must belong to a cluster, and lastly (3) they assume some known probability density function (PDF) that may have generated the observed data.…”
Section: Seed Morphometric Analysis
Citation type: mentioning
confidence: 99%
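
The citation statements above stress that a density-based method can leave outliers unassigned instead of forcing every point into a cluster. The following minimal sketch shows two quantities on which HDBSCAN* is built, the core distance and the mutual reachability distance, and how an isolated point stands out in them. It is an illustration only, not the Tribuo Hdbscan code: the function names, the sample points, and the choice k = 2 are assumptions made here, and the core distance is taken as the distance to the k-th nearest neighbour excluding the point itself (some formulations count the point itself).

```python
import math


def core_distances(points, k):
    """Distance from each point to its k-th nearest neighbour.

    Assumption: the point itself is not counted among its neighbours.
    """
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return result


def mutual_reachability(points, k):
    """Pairwise mutual reachability distances.

    For two distinct points a and b this is
    max(core(a), core(b), d(a, b)); the diagonal is set to 0.
    """
    core = core_distances(points, k)
    n = len(points)
    return [
        [
            0.0 if i == j
            else max(core[i], core[j], math.dist(points[i], points[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical data: four points forming a dense square and one far-away point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
core = core_distances(points, k=2)
mrd = mutual_reachability(points, k=2)

for point, c in zip(points, core):
    print(point, round(c, 3))
```

In this toy data set the four square corners have a small core distance, while the isolated point has a core distance more than ten times larger, so every mutual reachability distance involving it is large as well. HDBSCAN* builds a minimum spanning tree over these distances and extracts a cluster hierarchy from it; points that sit in such sparse regions are the ones that tend to end up labelled as noise.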