2019
DOI: 10.1038/s41598-019-44892-y

Element-centric clustering comparison unifies overlaps and hierarchy

Abstract: Clustering is one of the most universal approaches for understanding complex data. A pivotal aspect of clustering analysis is quantitatively comparing clusterings; clustering comparison is the basis for many tasks such as clustering evaluation, consensus clustering, and tracking the temporal evolution of clusters. In particular, the extrinsic evaluation of clustering methods requires comparing the uncovered clusterings to planted clusterings or known metadata. Yet, as we demonstrate, existing clustering compar…
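To make the comparison concrete, below is a minimal sketch of an element-centric similarity for the special case of two disjoint partitions, using the closed-form personalized-PageRank affinities that apply to hard clusterings: a restart weight of 1 − α stays on the element itself and α is spread uniformly over its cluster. The function name `element_sim` and the default α = 0.9 are illustrative choices, not the paper's exact interface; the authors' released reference implementation is the CluSim Python package.

```python
import numpy as np

def element_sim(labels_a, labels_b, alpha=0.9):
    """Element-centric similarity sketch for two hard partitions.

    For disjoint clusters, the personalized-PageRank affinity of element i is
    p_ij = alpha / |c_i| for every j in i's cluster, plus an extra (1 - alpha)
    at j = i.  The element-wise score is
        S_i = 1 - (1 / (2 * alpha)) * sum_j |p^A_ij - p^B_ij|,
    and the clustering-level similarity is the mean of the S_i.
    """
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)
    n = labels_a.shape[0]
    scores = np.empty(n)
    for i in range(n):
        in_a = labels_a == labels_a[i]          # cluster-mates of i in A
        in_b = labels_b == labels_b[i]          # cluster-mates of i in B
        p_a = np.where(in_a, alpha / in_a.sum(), 0.0)
        p_b = np.where(in_b, alpha / in_b.sum(), 0.0)
        p_a[i] += 1.0 - alpha                   # restart mass stays on i
        p_b[i] += 1.0 - alpha
        scores[i] = 1.0 - np.abs(p_a - p_b).sum() / (2.0 * alpha)
    return scores.mean(), scores
```

Calling, for example, `element_sim([0, 0, 1, 1], [0, 0, 0, 1])` returns the mean score together with the per-element scores, so disagreements can be localized to individual elements rather than reported only as one aggregate number.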


Cited by 89 publications (82 citation statements)
References 61 publications (71 reference statements)

Citation statements, ordered by relevance:
“…Disjointedness and cohesion strength take the community identity of the other nodes into account, but disjointedness focuses on how a node independently changes its community identity apart from the other nodes, whereas cohesion strength only counts mutual companionship without taking the absence of companionship into account. The Rand index [16] measures the similarity of data clusterings, but it is a cluster-centric measure [17], while CoI is node-centric. The aforementioned measures also utilize the fuzziness of community [18], as CoI does.…”
Section: Nodes (mentioning)
confidence: 99%
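As a point of reference for the contrast drawn in the statement above, the sketch below computes the plain pair-counting Rand index. It returns a single clustering-level number and assigns nothing to individual elements, which is the sense in which it is cluster- rather than node-centric. The helper name `rand_index` is illustrative; in practice `sklearn.metrics.adjusted_rand_score` is the usual chance-corrected variant.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Pair-counting Rand index: the fraction of element pairs on which the
    two clusterings agree (grouped together in both, or separated in both).
    The result is one aggregate number for the whole clustering; no
    per-element score is produced."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)
```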
“…Such analysis may be seen as a time-varying adaptation of the concept of ‘frustrated clusterings’ proposed by Gates et al. (2019), which refers to observations for which the method ‘cannot consistently decide on a grouping’ (p. 8). Identify the distinctive features of non-persistent banks (relative to persistent ones) by comparing banks that changed their business model in a given triennium (t+1) with other banks that held the same business model in the triennium prior to the change (t) and did not change their business model in t+1, with respect to the features exhibited by both banks in triennium t. To carry out this analysis, we run Bayesian logistic regressions (J regressions; i.e.…”
Section: Methods (mentioning)
confidence: 99%
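The ‘frustrated clusterings’ idea referenced above (observations a method cannot consistently place) can be probed with a per-element stability check. The sketch below is not the paper's element-centric procedure; it is a simpler, self-contained proxy that scores each observation by the Jaccard overlap of its cluster-mates across repeated k-means runs. All names and parameter values are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans

def per_element_stability(X, n_clusters=3, n_runs=20, seed=0):
    """Average Jaccard overlap of each element's cluster-mates across
    repeated k-means runs; low values flag observations whose grouping the
    method cannot consistently decide on."""
    rng = np.random.default_rng(seed)
    runs = [
        KMeans(n_clusters=n_clusters, n_init=10,
               random_state=int(rng.integers(10**6))).fit_predict(X)
        for _ in range(n_runs)
    ]
    n = X.shape[0]
    stability = np.zeros(n)
    n_pairs = 0
    for a in range(n_runs):
        for b in range(a + 1, n_runs):
            for i in range(n):
                mates_a = set(np.flatnonzero(runs[a] == runs[a][i]))
                mates_b = set(np.flatnonzero(runs[b] == runs[b][i]))
                stability[i] += len(mates_a & mates_b) / len(mates_a | mates_b)
            n_pairs += 1
    return stability / n_pairs
```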
“…As expected, as the disturbances become larger, the similarity of classifications with the baseline sample decreases for all measures (e.g. Gates et al., 2019). This result suggests that, although the approach handles small disturbances well, practitioners should strive to use a stable sample in a scenario where business model analysis is performed in a time-varying setting (e.g.…”
Section: Robustness Checks (mentioning)
confidence: 99%
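The disturbance analysis described in the statement above can be reproduced generically with a small loop: recluster after adding increasing amounts of feature noise and compare each result with the baseline clustering. The sketch below uses k-means and the adjusted Rand index purely as stand-ins; any clustering method and any comparison measure, including the element-centric one, can be substituted.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

def perturbation_check(X, n_clusters=3, noise_scales=(0.01, 0.05, 0.10), seed=0):
    """Compare clusterings of increasingly disturbed copies of X against the
    clustering of the unperturbed baseline sample."""
    rng = np.random.default_rng(seed)

    def fit(data):
        return KMeans(n_clusters=n_clusters, n_init=10,
                      random_state=seed).fit_predict(data)

    baseline = fit(X)
    feature_sd = X.std(axis=0)
    similarity = {}
    for s in noise_scales:
        # Gaussian noise scaled per feature, proportional to its spread.
        disturbed = X + rng.normal(0.0, s * feature_sd, size=X.shape)
        similarity[s] = adjusted_rand_score(baseline, fit(disturbed))
    return similarity
```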
“…Essentially, these measures assess clustering methods from different viewpoints, and in practice, there is no clustering method that could possibly reach the best performance in all of these performance metrics for a given problem domain [24]. A number of studies revolve around developing performance measures for clustering methods with the aim of determining the appropriateness of the produced clusters [25], [26]. However, surprisingly, although there is an increasing consensus concerning the importance of properly identifying the best clustering method and subsequently interpreting the produced result for a given problem, a limited number of research studies [26]–[28], if any, have comprehensively considered both internal and external measurements for the evaluation of clustering methods in an educational context.…”
Section: Introduction (mentioning)
confidence: 99%
“…A number of studies revolve around developing performance measures for clustering methods with the aim of determining the appropriateness of the produced clusters [25], [26]. However, surprisingly, although there is an increasing consensus concerning the importance of properly identifying the best clustering method and subsequently interpreting the produced result for a given problem, a limited number of research studies [26]–[28], if any, have comprehensively considered both internal and external measurements for the evaluation of clustering methods in an educational context. In addition to the tedious process behind the experimentation and the data preprocessing, one main reason is that cluster evaluation normally involves multiple conflicting criteria (due to a large number of external and internal metrics).…”
Section: Introduction (mentioning)
confidence: 99%
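To illustrate the internal-versus-external distinction made in the statement above, the toy sketch below scores the same k-means solutions with one internal measure (silhouette, which needs no ground truth) and one external measure (adjusted Rand index against known labels). The synthetic data and the specific measures are illustrative only.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score, silhouette_score

# Synthetic data with known generating labels, so both kinds of measure apply.
X, y_true = make_blobs(n_samples=300, centers=4, random_state=0)

for k in (2, 3, 4, 5):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    internal = silhouette_score(X, labels)           # uses only X and labels
    external = adjusted_rand_score(y_true, labels)   # uses the planted labels
    print(f"k={k}: silhouette={internal:.3f}, ARI={external:.3f}")
```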