2007
DOI: 10.1089/cmb.2007.r012

Efficiently Computing the Robinson-Foulds Metric

Abstract: The Robinson-Foulds (RF) metric is the measure most widely used in comparing phylogenetic trees; it can be computed in linear time using Day's algorithm. When faced with the need to compare large numbers of large trees, however, even linear time becomes prohibitive. We present a randomized approximation scheme that provides, in sublinear time and with high probability, a (1 + epsilon) approximation of the true RF metric. Our approach is to use a sublinear-space embedding of the trees, combined with an applicat…

Cited by 69 publications (41 citation statements)
References 13 publications
“…maximizes) this cost and c norm (G, S) the normalized cost over the range of possible values and defined as follows: S min )). The similarity between S and the species tree S 0 is evaluated by the classical Robinson and Foulds distance between phylogenetic trees [15,16], denoted RF (S 0 , S). For the three costs, Figures 3 (left) below depicts the distribution of each tree S ∈ K n according to the normalized cost l norm (G, S).…”
Section: Results (mentioning)
confidence: 99%
“…The Robinson-Foulds (RF) distance [33] is a commonly used metric for matching phylogenetic trees [34]. Given two trees T1 and T2, each contains m number of leaves; then C1 is a set consisting of m − 1 subsets each signifying one of the m − 1 nodes of T1.…”
Section: Robinson-Foulds Distance (mentioning)
confidence: 99%
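
The cluster sets described in the statement above can be illustrated with a small sketch. It assumes a rooted, fully binary tree written as nested Python tuples with string leaf labels; the `Tree` alias, the `clusters` function, and the example tree `t1` are hypothetical names introduced here, not taken from the citing paper.

```python
from typing import FrozenSet, Set, Tuple, Union

# Assumed encoding: a leaf is a string label, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def clusters(tree: Tree) -> Set[FrozenSet[str]]:
    """Return the set of leaf labels found below each internal node."""
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            # A leaf contributes only its own label and is not recorded as a cluster.
            return frozenset([node])
        # An internal node's cluster is the union of its children's leaf sets.
        leaves = frozenset().union(*(visit(child) for child in node))
        found.add(leaves)
        return leaves

    visit(tree)
    return found


# Hypothetical rooted binary tree with m = 5 leaves.
t1 = ((("A", "B"), "C"), ("D", "E"))
c1 = clusters(t1)
```

For this five-leaf binary tree, `c1` holds m − 1 = 4 clusters, one per internal node, including the root cluster that contains every leaf.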
“…The RF distance between two trees is defined by the sum of the number of data partitions implied by one, but not both, of the trees. A variety of algorithms exist for computing RF distance [76,77], and an optimal method is usually selected on the basis of algorithmic complexity and worst-case running time [51,78,79]. In this study, a gene network was constructed based on tree topology similarity using RF (transformed to RF 0 , which converts RF values onto a scale where higher values correspond to less similarity).…”
Section: Phylogenetic Analysis (mentioning)
confidence: 99%
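
The definition quoted above (partitions implied by one tree but not both) amounts to the size of a symmetric difference between two sets. The following is a minimal sketch of that definition only, assuming each tree has already been reduced to a set of partitions represented as frozensets of leaf labels. It is a naive set-based computation, not Day's linear-time algorithm and not the randomized approximation scheme from the abstract; `rf_distance`, `Split`, and the example sets are hypothetical names.

```python
from typing import FrozenSet, Set

Split = FrozenSet[str]


def rf_distance(splits_a: Set[Split], splits_b: Set[Split]) -> int:
    """Count the partitions present in exactly one of the two trees."""
    # Assumption: both inputs use the same canonical form for each partition
    # (for unrooted trees, a split and its complement must map to one key).
    return len(splits_a ^ splits_b)


# Hypothetical partition sets for two trees on the leaves A..E.
tree_a = {frozenset("AB"), frozenset("ABC"), frozenset("DE")}
tree_b = {frozenset("AB"), frozenset("DE"), frozenset("CDE")}

rf = rf_distance(tree_a, tree_b)
```

Here the two trees share `{A, B}` and `{D, E}` and differ in one partition each, so `rf` is 2. Building the partition sets this way costs time and memory proportional to the total size of all partitions, which is why faster methods are preferred for large collections of trees.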