Unsupervised Learning Algorithms 2016
DOI: 10.1007/978-3-319-24211-8_4

Clustering Evaluation in High-Dimensional Data

Abstract: Clustering evaluation plays an important role in unsupervised learning systems, as it is often necessary to automatically quantify the quality of generated cluster configurations. This is especially useful for comparing the performance of different clustering algorithms as well as determining the optimal number of clusters in clustering algorithms that do not estimate it internally. Many clustering quality indexes have been proposed over the years and different indexes are used in different contexts. There is …

Cited by 22 publications (20 citation statements)
References 95 publications

“…Arbelaitz et al. (2013) compare 30 measures of clustering, some of which are close variants of each other. A similar study, on synthetic data and a smaller number of measures, was undertaken by Tomašev and Radovanović (2016). The experiments by Arbelaitz et al. include results on 20 real, labelled datasets of up to 166 features and up to 2310 items, much smaller in both dimensions than is the case for document collections, but in the absence of larger-scale evaluations this work provides the best independent reference of which we are aware.…”
Section: Background On Measurement Of Cluster Quality (mentioning)
confidence: 90%
“…Choice of Dimensionality Reduction Technique: Various DR methods are analyzed as to whether they improve the clustering behavior of DBSCAN compared to applying clustering on the original dimensions. Ḡ+, a measure of the discordance between pairs of point distances that is robust w.r.t. differences in dimensionality [25], is used as a metric. It indicates whether members of the same cluster are closer together than members of different clusters.…”
Section: Discussion (mentioning)
confidence: 99%
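
The discordance idea in the statement above can be illustrated with a minimal sketch. It assumes the commonly cited definition of the G+ index, 2·s⁻ / (t·(t − 1)), where t is the number of pairwise distances and s⁻ counts the cases in which a within-cluster distance exceeds a between-cluster distance. The citing paper's exact Ḡ+ variant is not given here, so the function name `g_plus` and this formula are assumptions for illustration only.

```python
import math
from itertools import combinations


def g_plus(points, labels):
    """Assumed G+ form: 2 * s_minus / (t * (t - 1)); lower means fewer discordant pairs."""
    within, between = [], []
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])
        # Sort each pairwise distance by whether the two points share a cluster.
        (within if labels[i] == labels[j] else between).append(d)
    t = len(within) + len(between)  # total number of pairwise distances
    if t < 2:
        return 0.0
    # A discordant pair: a within-cluster distance larger than a between-cluster one.
    s_minus = sum(1 for w in within for b in between if w > b)
    return 2.0 * s_minus / (t * (t - 1))


points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = g_plus(points, [0, 0, 1, 1])  # clusters match the two visible groups
bad = g_plus(points, [0, 1, 0, 1])   # clusters cut across the groups
```

With these four points, the labelling that follows the two groups has no discordant pairs, while the labelling that cuts across them does, so `good` is lower than `bad`. The double loop is quadratic in the number of distances, so this form only suits small samples.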
“…In particular in high-dimensional space, performance evaluation is a major challenge and, therefore, needs to be optimised iteratively for determining the optimal number of clusters as well as the clustering techniques to be applied. This is done until a cluster split can be found that provides the best performance evaluation values (Tomašev and Radovanović 2016). To this end, external metrics are used in addition to selected internal evaluation metrics from the optimisation of k to assess the formed clusters, to be able to guarantee a holistic perspective and to provide the clearest assessment of the cluster quality.…”
Section: Performance Evaluation (mentioning)
confidence: 99%
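
The iterative search described in the statement above can be sketched as a loop over candidate values of k that keeps the split with the best internal index value. The statement does not say which clustering technique or index was used, so this sketch substitutes a small deterministic k-means and the mean silhouette width as stand-ins. All function names (`kmeans`, `silhouette`, `best_k`) are hypothetical, and the code assumes distinct points and at least two non-empty clusters.

```python
import math


def kmeans(points, k, iters=20):
    """Tiny Lloyd-style k-means with farthest-point initialisation (a stand-in)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next centre: the point farthest from its nearest existing centre.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centre if a cluster happens to be empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette width; singleton clusters contribute a score of 0."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def best_k(points, candidates):
    """Return (k, score) for the candidate k with the highest mean silhouette."""
    scored = [(silhouette(points, kmeans(points, k)), k) for k in candidates]
    score, k = max(scored)
    return k, score


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10),
          (20, 0), (20, 1), (21, 0)]
chosen_k, chosen_score = best_k(points, range(2, 6))
```

On this toy data of three well-separated groups, the loop selects k = 3. External metrics, which the statement also mentions, would need ground-truth labels and are left out of the sketch.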