2020
DOI: 10.1073/pnas.1912957117

Testing for dependence on tree structures

Abstract: Tree structures, showing hierarchical relationships and the latent structures between samples, are ubiquitous in genomic and biomedical sciences. A common question in many studies is whether there is an association between a response variable measured on each sample and the latent group structure represented by some given tree. Currently, this is addressed on an ad hoc basis, usually requiring the user to decide on an appropriate number of clusters to prune out of the tree to be tested against the response var…

Cited by 18 publications (15 citation statements)
References 20 publications (34 reference statements)
“…It would be interesting to compare results from a distance dependent CRP with the tree-informed CRP that we have defined here. Yet another alternative is to identify a MAP set of branch mutations b or some other high-confidence set of mutated branches, as in Behr et al (2020). This has the advantage of avoiding both computational bottlenecks (computing the tree-informed prior and sampling from the posterior), but the method described in Behr et al (2020) does not account for the covariance between individuals induced by combinations of haplotypes and additive effects.…”
Section: Discussion (mentioning)
confidence: 99%
“…Ewens’s sampling formula provides an intuitive mechanism for introducing prior information about haplotype relatedness: assuming that the phylogenetic tree of the haplotypes is known rather than random. This defines a prior distribution over the allelic series that is informed by a tree. In this way, our approach is similar to other models that include phylogenetic information; for example, by modeling distributional “changepoints” on a tree (Ansari and Didelot 2016), or by using phylogenetic distance as an input for a distance-dependent CRP (Cybis et al 2018), among others (Zhang et al 2012; Thompson and Kubatko 2013; Behr et al 2020; Selle et al 2020). In particular, Ansari and Didelot (2016) specify a prior distribution over the allelic series by defining the prior probability that each branch of a tree is functionally mutated with respect to a phenotype (in their case, a categorical trait).…”
mentioning
confidence: 99%
“…We note that recent work provides a method that can test for a dependence between the tree structure and a given leaf variable (e.g. isolate phenotype coded as a binary or continuous variable) [22]. This explicitly takes into account all possible clustering structures and avoids having to arbitrarily choose a cut-off threshold.…”
Section: Methods (mentioning)
confidence: 99%
“…For this reason, it is important to note this sensitivity when superimposing meta-data onto a tree structure. This issue can be avoided with the use of a formal statistical test between the tree structure and meta-data as proposed in [22]. Isolates are plotted along the first two principal components.…”
Section: Plos Genetics (mentioning)
confidence: 99%
“…For this reason, it is important to note this sensitivity when superimposing meta-data onto a tree structure. It is possible to avoid this issue by using a formal statistical test between the tree structure and meta-data as proposed in [22].…”
Section: Sensitivity Of Hac-based Discrete Clustering (mentioning)
confidence: 99%