The program used to estimate our sequence distance is available at http://www.cs.cityu.edu.hk/~cssamk/gencomp/GenCompress1.htm. The distance matrices used to generate our phylogenies are available at http://www.math.uwaterloo.ca/~mli/distance.html.
Inferring evolutionary trees has long been a challenging problem for both biologists and computer scientists. In recent years, research has concentrated on the quartet method paradigm for inferring evolutionary trees. Quartet methods proceed by first inferring the evolutionary history for every set of four species (resulting in a set Q of inferred quartet topologies) and then recombining these inferred quartet topologies to form an evolutionary tree. This paper presents two results on the quartet method paradigm. The first is a polynomial time approximation scheme (PTAS) for recombining the inferred quartet topologies optimally. This is an important result since, to date, there have been no polynomial time algorithms with performance guarantees for quartet methods. To achieve this result, the natural denseness of the set Q is exploited. The second result is a new technique, called quartet cleaning, that detects and corrects errors in the set Q with performance guarantees. This result has particular significance since quartet methods are usually very sensitive to errors in the data. It is shown how quartet cleaning can dramatically increase the accuracy of quartet methods.
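The first stage described above, inferring a topology for every set of four species, can be illustrated with a minimal sketch. The abstract does not say how the quartet topologies in Q are inferred; the sketch below assumes a pairwise distance matrix is available and uses the standard four-point method, which picks the pairing with the smallest sum of within-pair distances. The function name `infer_quartets` and its input layout are hypothetical, and the recombination and quartet cleaning steps are not shown.

```python
from itertools import combinations


def infer_quartets(names, dist):
    """Infer one topology per 4-taxon subset (assumed four-point method).

    names: list of taxon labels.
    dist:  symmetric matrix, dist[i][j] = distance between names[i] and names[j].
    Returns a dict mapping frozenset of four labels to a pair of frozensets,
    e.g. ({A, B}, {C, D}) for the topology AB|CD.
    """
    quartets = {}
    for a, b, c, d in combinations(range(len(names)), 4):
        # The three possible unrooted topologies on {a, b, c, d},
        # each scored by the sum of its two within-pair distances.
        candidates = [
            (dist[a][b] + dist[c][d], ((a, b), (c, d))),
            (dist[a][c] + dist[b][d], ((a, c), (b, d))),
            (dist[a][d] + dist[b][c], ((a, d), (b, c))),
        ]
        # Four-point method: keep the pairing with the smallest sum.
        _, split = min(candidates, key=lambda item: item[0])
        key = frozenset(names[i] for i in (a, b, c, d))
        quartets[key] = tuple(
            frozenset(names[i] for i in pair) for pair in split
        )
    return quartets


# Toy additive distances for the tree ((A,B),C,(D,E)) with unit edge lengths.
names = ["A", "B", "C", "D", "E"]
dist = [
    [0, 2, 3, 4, 4],
    [2, 0, 3, 4, 4],
    [3, 3, 0, 3, 3],
    [4, 4, 3, 0, 2],
    [4, 4, 3, 2, 0],
]
Q = infer_quartets(names, dist)
```

With n species the set Q contains one entry for each of the C(n, 4) four-species subsets, which is the denseness the abstract refers to. On real data the distances are not perfectly additive, so some entries of Q will be wrong; that is the situation quartet cleaning is meant to address.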
Abstract. In this paper, we introduce new algorithms for selecting taxon (leaf) samples from large phylogenetic trees, uniformly at random, under certain biologically relevant constraints on the taxa. All the algorithms run in polynomial time and have been implemented. The algorithms have direct applications to the evaluation of phylogenetic tree and supertree construction methods using biologically curated data. We also relate one of the sampling problems to the well-known clique problem on undirected graphs. From this, we obtain an interesting new class of graphs for which many open problems exist.