2020
DOI: 10.1186/s12859-020-3453-6

Fast tree aggregation for consensus hierarchical clustering

Abstract: Background: In unsupervised learning and clustering, data integration from different sources and types is a difficult question discussed in several research areas. For instance, in omics analysis, dozens of clustering methods have been developed in the past decade. When a single source of data is at play, hierarchical clustering (HC) is extremely popular, as a tree structure is highly interpretable and arguably more informative than just a partition of the data. However, blindly applying HC to multiple sources o…

Cited by 16 publications (14 citation statements)
References 28 publications (28 reference statements)
“…To construct a consensus dendrogram, hierarchical clusters produced with Copia, Gypsy, and total TE abundances were used. The consensus dendrogram was calculated using the mergeTree v 0.1.3 package in R (Hulot et al 2019). C. serrulata was not included in these analyses, due to the lack of DNA sequencing data for this species.…”
Section: Methods (mentioning)
confidence: 99%
“…Consensus clustering, sometimes called aggregated clustering or clustering ensembles, uses multiple clusterings derived from (a) different clustering algorithms, (b) multiple permutations of a single algorithm, or (c) multiple iterations of a single algorithm on subgroups of a dataset to derive one, final set of cluster assignments. Consensus clustering has the theoretical advantages of minimizing overfitting and optimizing stability of cluster assignments, as has been shown for hierarchical clustering on genomic datasets from disparate sources and for identifying subgroups of heterogeneous intensive care unit patients (Vranas et al, 2017; Hulot et al, 2020).…”
Section: Consensus Clustering (mentioning)
confidence: 99%
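
The idea in the statement above, deriving one final set of cluster assignments from several clusterings, can be illustrated with a small sketch. This is a generic co-association approach, not the tree-aggregation method of the cited paper; the function names `co_association` and `consensus_labels` and the 0.5 threshold are hypothetical choices made for illustration only.

```python
from itertools import combinations


def co_association(labelings):
    """Return an n x n matrix: the fraction of labelings in which
    items i and j are placed in the same cluster."""
    n = len(labelings[0])
    counts = [[0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(labelings)
    return [[c / m for c in row] for row in counts]


def consensus_labels(labelings, threshold=0.5):
    """Assumed consensus rule: link two items when they co-occur in more
    than `threshold` of the labelings, then take connected components."""
    co = co_association(labelings)
    n = len(co)
    parent = list(range(n))  # union-find forest over the items

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if co[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Three clusterings of the same five items (cluster ids are arbitrary).
labelings = [
    [0, 0, 1, 1, 2],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
]
print(consensus_labels(labelings))
```

Items 0 and 1 share a cluster in all three inputs, while items 2, 3 and 4 are chained together by pairs that agree in two of three inputs, so the sketch returns two consensus groups. Connected components behave like single linkage and can chain loosely related items; other consensus rules would be equally reasonable here.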
“…Since our approach involves unions of rules, it is partly related to the works where rules or decision trees are merged for various purposes. Hierarchical merging of several trees was addressed in the context of the problem of learning decision trees from multiple sources of the data – so the challenge is to produce one tree that will cover the decisions of others [25]. Another problem that is addressed is construction of consensus trees from different ones with the goal of producing a more stable model [26].…”
Section: Related Work Discussion (mentioning)
confidence: 99%