Efficient algorithms for inferring evolutionary trees

Gusfield, Dan

doi:10.1002/net.3230210104

Cited by 351 publications

(332 citation statements)

References 12 publications

Supporting

Mentioning

331

Contrasting

“…that they can be placed at the leaves of an evolutionary tree within which each site mutates at most once. Haplotype matrices admitting a perfect phylogeny are completely characterised [8] [9] by the absence of the forbidden submatrix…”

Section: Problem: Parsimony Haplotyping (PH)

Type: mentioning

confidence: 99%

Beaches of Islands of Tractability: Algorithms for Parsimony and Minimum Perfect Phylogeny Haplotyping Problems

Iersel, Keijsper, Kelk, et al. 2006

Lecture Notes in Computer Science

The problem Parsimony Haplotyping (PH) asks for the smallest set of haplotypes which can explain a given set of genotypes, and the problem Minimum Perfect Phylogeny Haplotyping (MPPH) asks for the smallest such set which also allows the haplotypes to be embedded in a perfect phylogeny evolutionary tree, a well-known biologically-motivated data structure. For PH we extend recent work of [16] by further mapping the interface between "easy" and "hard" instances, within the framework of (k, l)-bounded instances. By exploring, in the same way, the tractability frontier of MPPH we provide the first concrete, positive results for this problem, and the algorithms underpinning these results offer new insights about how MPPH might be further tackled in the future. In both PH and MPPH intriguing open problems remain.
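To make "explain a given set of genotypes" concrete, here is a minimal Python sketch. It assumes the encoding commonly used in the haplotyping literature, which the abstract itself does not state: haplotypes are 0/1 vectors, genotypes are 0/1/2 vectors, and 2 marks a site where the two underlying haplotypes differ. The function names `resolves` and `explains` are hypothetical and used only for illustration.

```python
from itertools import combinations_with_replacement


def resolves(h1, h2, genotype):
    """True if the haplotype pair (h1, h2) resolves the genotype.

    Assumed encoding: where h1 and h2 agree the genotype carries that
    shared value; where they differ the genotype carries a 2.
    """
    if not (len(h1) == len(h2) == len(genotype)):
        return False
    return all(g == (a if a == b else 2) for a, b, g in zip(h1, h2, genotype))


def explains(haplotypes, genotypes):
    """True if every genotype is resolved by some pair drawn from haplotypes.

    A haplotype may be paired with itself (a fully homozygous genotype).
    """
    pairs = list(combinations_with_replacement(haplotypes, 2))
    return all(any(resolves(a, b, g) for a, b in pairs) for g in genotypes)


genotypes = [(2, 2), (0, 2)]
print(explains([(0, 1), (1, 0), (0, 0)], genotypes))  # three haplotypes
print(explains([(0, 0), (1, 1)], genotypes))          # two haplotypes
```

This only checks a candidate haplotype set. Finding the smallest such set is the optimisation problem the abstract refers to, and the sketch does not attempt it.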

“…He showed that the BPP problem can be solved in linear time [8]. The problem we consider is an extension called the binary near perfect phylogeny reconstruction (BNPP).…”

Section: Preliminaries

Type: mentioning

confidence: 99%

“…Input Assumptions: If no pair of characters in input I contains the four-gamete property, we can use Gusfield's elegant algorithm [8] to reconstruct a perfect phylogeny. We assume that the all zeros taxa is present in the input.…”

Section: Lemma 1 [8] the Most Parsimonious Phylogeny For Input I Is…

Type: mentioning

confidence: 99%
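The input assumption quoted above rests on a pairwise check that is easy to sketch. The following Python is an illustration only, not code from the citing paper or from Gusfield's article. It assumes a taxa-by-characters matrix of 0/1 entries given as a list of rows; the function names are hypothetical.

```python
from itertools import combinations


def has_four_gametes(matrix, i, j):
    """True if columns i and j together show all of 00, 01, 10 and 11."""
    seen = {(row[i], row[j]) for row in matrix}
    return len(seen) == 4  # assumes every entry is 0 or 1


def conflicting_pairs(matrix):
    """All pairs of characters (columns) that exhibit four gametes."""
    m = len(matrix[0]) if matrix else 0
    return [
        (i, j)
        for i, j in combinations(range(m), 2)
        if has_four_gametes(matrix, i, j)
    ]


def admits_perfect_phylogeny(matrix):
    """True if no pair of binary characters exhibits four gametes."""
    return not conflicting_pairs(matrix)


compatible = [
    [0, 0, 0],  # the all-zeros taxon
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
conflicting = [[0, 0], [0, 1], [1, 0], [1, 1]]
print(admits_perfect_phylogeny(compatible))
print(conflicting_pairs(conflicting))
```

This check compares every pair of columns, so it takes time quadratic in the number of characters. It is the naive test, not the linear-time method the citation statements attribute to [8].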

“…Only characters corresponding to non-isolated vertices can mutate more than once in any optimal phylogeny (a simple proof follows from Buneman graphs [16]). Since all characters of C \ M mutate exactly once, the algorithm constructs a perfect phylogeny on this character set using Gusfield's linear time algorithm [8]. The perfect phylogeny is unique because of Lemma 2.…”

Section: Let G(v E) Be the Conflict Graph Of I 2 Let Vnis ⊆ V Be Th…

Type: mentioning

confidence: 99%
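For readers who want to see what constructing a perfect phylogeny on a conflict-free character set can look like, here is a hedged Python sketch. It is not Gusfield's linear-time algorithm. It is a simpler quadratic construction that assumes binary characters with the all-zeros state at the root, so that the sets of taxa carrying a 1 for each character must be nested or disjoint. The function name and the node representation (frozensets of row indices, plus the string "root") are assumptions made for illustration.

```python
def build_perfect_phylogeny(matrix):
    """Return (parent, taxon_node) for a rooted perfect phylogeny, or None.

    Each internal node is the frozenset of taxa (row indices) carrying a 1
    for some character; "root" stands for the all-zeros ancestor.
    """
    n = len(matrix)
    m = len(matrix[0]) if matrix else 0
    clusters = {
        frozenset(r for r in range(n) if matrix[r][c]) for c in range(m)
    }
    clusters.discard(frozenset())  # all-zero columns never mutate
    ordered = sorted(clusters, key=len, reverse=True)

    parent = {}
    for idx, cluster in enumerate(ordered):
        parent[cluster] = "root"
        for other in ordered[:idx]:  # clusters at least as large
            if cluster <= other:
                # Later matches are smaller, so the last one wins:
                # the parent is the smallest enclosing cluster.
                parent[cluster] = other
            elif cluster & other:
                return None  # partial overlap: no perfect phylogeny

    taxon_node = {}
    for r in range(n):
        containing = [c for c in ordered if r in c]
        # The smallest cluster containing the taxon is its attachment point.
        taxon_node[r] = containing[-1] if containing else "root"
    return parent, taxon_node


matrix = [
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
parent, taxon_node = build_perfect_phylogeny(matrix)
print(parent)
print(taxon_node)
```

Identical columns collapse into a single node here, which matches the usual convention that characters mutating on the same edge share it.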

Simple Reconstruction of Binary Near-Perfect Phylogenetic Trees

Sridhar, Dhamdhere, Blelloch, et al. 2006

Computational Science – ICCS 2006

Abstract. We consider the problem of reconstructing near-perfect phylogenetic trees using binary character states (referred to as BNPP). A perfect phylogeny assumes that every character mutates at most once in the evolutionary tree, yielding an algorithm for binary character states that is computationally efficient but not robust to imperfections in real data. A near-perfect phylogeny relaxes the perfect phylogeny assumption by allowing at most a constant number q of additional mutations. In this paper, we develop an algorithm for constructing optimal phylogenies and provide empirical evidence of its performance. The algorithm runs in time $O((72\kappa)^q nm + nm^2)$ where n is the number of taxa, m is the number of characters and κ is the number of characters that share four gametes with some other character. This is fixed parameter tractable when q and κ are constants and significantly improves on the previous asymptotic bounds by reducing the exponent to q. Furthermore, the complexity of the previous work makes it impractical and in fact no known implementation of it exists. We implement our algorithm and demonstrate it on a selection of real data sets, showing that it substantially outperforms its worst-case bounds and yields far superior results to a commonly used heuristic method in at least one case. Our results therefore describe the first practical phylogenetic tree reconstruction algorithm that finds guaranteed optimal solutions while being easily implemented and computationally feasible for data sets of biologically meaningful size and complexity.

“…Numerous phylogenetic inference methods, e.g. maximum parsimony, maximum likelihood, distance matrix fitting, subtrees consistency, and quartet based methods have been proposed over the years [15,1,14,26,17,27,4]; furthermore, it is rather common to compare the same set of species w.r.t. different biological sequences or different genes, hence obtaining various trees.…”

Section: Introduction

Type: mentioning

confidence: 99%

Efficient Algorithms for Descendent Subtrees Comparison of Phylogenetic Trees with Applications to Co-evolutionary Classifications in Bacterial Genome

Lin, Hsu 2003

Algorithms and Computation

Abstract. A phylogenetic tree is a rooted tree with unbounded degree such that each leaf node is uniquely labelled from 1 to n. A descendent subtree of a phylogenetic tree T is the subtree composed of all edges and nodes of T descending from a vertex. Given a set of phylogenetic trees, we present linear time algorithms for finding all leaf-agree descendent subtrees as well as all isomorphic descendent subtrees. The normalized cluster distance, d(A, B), of two sets is defined by d(A, B) = ∆(A, B)/(|A| + |B|), where ∆(A, B) denotes the symmetric set difference of two sets. We show that computing all pairs normalized cluster distances between descendent subtrees of two phylogenetic trees can be done in $O(n^2)$ time. Since the total size of the outputs will be $\Theta(n^2)$, the algorithm is thus computationally optimal. A nearest subtree of a subset of leaves is a descendent subtree that has the smallest normalized cluster distance to these leaves. Here we show that finding nearest subtrees for a collection of pairwise disjoint subsets of leaves can be done in O(n) time. Several applications of these algorithms in areas of bioinformatics are considered. Among them, we discuss the 2CS (Two component systems) functional analysis and classifications on bacterial genome.
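The normalized cluster distance defined in this abstract can be computed directly for a single pair of leaf sets. The Python below is a small sketch of the formula only. It assumes ∆(A, B) means the number of elements in the symmetric difference, and it makes no attempt at the all-pairs or nearest-subtree algorithms the abstract describes. The function name is hypothetical.

```python
def normalized_cluster_distance(a, b):
    """d(A, B) = |A symmetric-difference B| / (|A| + |B|).

    Assumes Delta(A, B) in the abstract denotes the size of the
    symmetric difference. Undefined when both sets are empty.
    """
    a, b = set(a), set(b)
    if not a and not b:
        raise ValueError("distance is undefined for two empty sets")
    return len(a ^ b) / (len(a) + len(b))


print(normalized_cluster_distance({1, 2, 3}, {2, 3, 4}))  # 2 / 6
print(normalized_cluster_distance({1, 2}, {1, 2}))        # identical sets
print(normalized_cluster_distance({1}, {2}))              # disjoint sets
```

Under this reading the distance ranges from 0 for identical sets to 1 for disjoint ones.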
