ape is distributed through the Comprehensive R Archive Network (http://cran.r-project.org/web/packages/ape/index.html). Further information may be found at http://ape.mpl.ird.fr/pegas/
Orthology detection is an important problem in comparative and evolutionary genomics and, consequently, a variety of orthology detection methods have been devised in recent years. Although many of these methods depend on generating gene and/or species trees, it has been shown that orthology can be estimated at acceptable levels of accuracy without having to infer gene trees and/or reconcile gene trees with species trees. Thus, it is of interest to understand how much information about the gene tree, the species tree, and their reconciliation is already contained in the orthology relation on the underlying set of genes. Here we shall show that a result by Böcker and Dress concerning symbolic ultrametrics, and subsequent algorithmic results by Semple and Steel for processing these structures, can throw a considerable amount of light on this problem. More specifically, building upon these authors' results, we present some new characterizations for symbolic ultrametrics and new algorithms for recovering the associated trees, with an emphasis on how these algorithms could potentially be extended to deal with arbitrary orthology relations. In so doing we shall also show that, somewhat surprisingly, symbolic ultrametrics are very closely related to cographs, graphs that do not contain an induced path on any subset of four vertices. We conclude with a discussion on how our results might be applied in practice to orthology detection.
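The cograph condition mentioned in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that an orthology relation is given as an undirected graph whose edges join orthologous gene pairs; the function names `has_induced_p4` and `is_cograph` are hypothetical, and the brute-force search over all four-vertex subsets is not one of the algorithms from the paper.

```python
from itertools import combinations, permutations


def has_induced_p4(vertices, edges):
    """Return True if the graph contains an induced path on four vertices."""
    edge_set = {frozenset(e) for e in edges}

    def adjacent(a, b):
        return frozenset((a, b)) in edge_set

    for quad in combinations(vertices, 4):
        for a, b, c, d in permutations(quad):
            # Induced path a-b-c-d: the three path edges are present
            # and none of the three remaining pairs is an edge.
            if (adjacent(a, b) and adjacent(b, c) and adjacent(c, d)
                    and not adjacent(a, c) and not adjacent(a, d)
                    and not adjacent(b, d)):
                return True
    return False


def is_cograph(vertices, edges):
    """A graph is a cograph exactly when it has no induced four-vertex path."""
    return not has_induced_p4(vertices, edges)


# Hypothetical gene names; edges stand for estimated orthologous pairs.
genes = ["g1", "g2", "g3", "g4"]
path = [("g1", "g2"), ("g2", "g3"), ("g3", "g4")]
cycle = path + [("g4", "g1")]
print(is_cograph(genes, path))   # the path itself is an induced P4
print(is_cograph(genes, cycle))  # closing the path into a 4-cycle removes it
```

The search is exponential in nothing but is still O(n^4) in the number of genes, so it is only suitable for small examples.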
A method is described that allows the treelikeness of phylogenetic distance data to be assessed before tree estimation. This method is related to statistical geometry as introduced by Eigen, Winkler-Oswatitsch, and Dress (1988 [Proc. Natl. Acad. Sci. USA. 85:5913-5917]) and, in essence, displays a measure of the treelikeness of quartets as a histogram that we call a delta plot. This allows identification of nontreelike data and analysis of noisy data sets arising from processes such as parallel evolution, recombination, or lateral gene transfer. In addition to an overall assessment of treelikeness, individual taxa can be ranked by reference to the treelikeness of the quartets to which they belong. Removal of taxa on the basis of this ranking results in an increase in accuracy of tree estimation. Recombinant data sets are simulated, and the method is shown to be capable of identifying single recombinant taxa on the basis of distance information alone, provided the parents of the recombinant sequence are sufficiently divergent and the mixture of tree histories is not strongly skewed toward a single tree. Delta plots and taxon rankings are applied to three biological data sets using distances derived from sequence alignment, gene order, and fragment length polymorphism.
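The quartet score behind a delta plot can be sketched as follows. The abstract does not state the formula, so this sketch assumes the commonly used definition: for each quartet, form the three pairwise distance sums, sort them, and take (largest - middle) / (largest - smallest), with a score of 0 when all three sums are equal. The function names and the per-taxon averaging used for ranking are illustrative assumptions, not the authors' implementation.

```python
from itertools import combinations


def quartet_delta(d, x, y, u, v):
    """Delta score of one quartet under the assumed definition (0 = treelike)."""
    smallest, middle, largest = sorted(
        [d[x][y] + d[u][v], d[x][u] + d[y][v], d[x][v] + d[y][u]]
    )
    if largest == smallest:
        return 0.0
    return (largest - middle) / (largest - smallest)


def delta_values(d):
    """Delta scores of all quartets; a histogram of these is the delta plot."""
    return [quartet_delta(d, *q) for q in combinations(range(len(d)), 4)]


def mean_delta_per_taxon(d):
    """Average delta over the quartets containing each taxon (for ranking)."""
    n = len(d)
    totals, counts = [0.0] * n, [0] * n
    for q in combinations(range(n), 4):
        delta = quartet_delta(d, *q)
        for taxon in q:
            totals[taxon] += delta
            counts[taxon] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


# Distances on the tree ((0,1),(2,3)) with all edge lengths equal to 1.
tree_like = [[0, 2, 3, 3], [2, 0, 3, 3], [3, 3, 0, 2], [3, 3, 2, 0]]
print(delta_values(tree_like))  # [0.0]
```

For an exact tree metric the two largest sums coincide, so every quartet scores 0; scores approaching 1 indicate quartets that no tree fits well.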
Background: Tree reconciliation problems have long been studied in phylogenetics. A particular variant of the reconciliation problem for a gene tree T and a species tree S assumes that for each interior vertex x of T it is known whether x represents a speciation or a duplication. This problem appears in the context of analyzing orthology data. Results: We show that S is a species tree for T if and only if S displays all rooted triples of T that have three distinct species as their leaves and are rooted in a speciation vertex. A valid reconciliation map can then be found in polynomial time. Simulated data show that the event-labeled gene trees convey a large amount of information on underlying species trees, even for a large percentage of losses. Conclusions: The knowledge of event labels in a gene tree strongly constrains the possible species tree and, for a given species tree, also the possible reconciliation maps. Nevertheless, many degrees of freedom remain in the space of feasible solutions. In order to disambiguate the alternative solutions, additional external constraints as well as optimization criteria could be employed.
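The triple condition in the Results can be sketched under some assumptions. Here a gene tree is a nested tuple whose first entry is the hypothetical event label "spec" or "dup" and whose leaves are species names; a species tree is a nested tuple of species names. The functions below are illustrative and do not construct the reconciliation map described in the paper.

```python
from itertools import combinations


def leaf_species(node):
    """Species labels of all leaves below a gene-tree vertex."""
    if isinstance(node, str):
        return [node]
    return [s for child in node[1:] for s in leaf_species(child)]


def speciation_triples(node, out=None):
    """Triples ab|c rooted at a speciation vertex with three distinct species."""
    out = set() if out is None else out
    if isinstance(node, str):
        return out
    event, children = node[0], node[1:]
    if event == "spec":
        for i, inner in enumerate(children):
            for j, outer in enumerate(children):
                if i == j:
                    continue
                # a and b lie in one child subtree, c in another one.
                for a, b in combinations(leaf_species(inner), 2):
                    for c in leaf_species(outer):
                        if len({a, b, c}) == 3:
                            out.add((frozenset((a, b)), c))
    for child in children:
        speciation_triples(child, out)
    return out


def species_below(node):
    """Set of species below a species-tree vertex."""
    if isinstance(node, str):
        return {node}
    return set().union(*(species_below(child) for child in node))


def displays(species_tree, triple):
    """True if the species tree displays the triple ab|c."""
    pair, outgroup = triple
    if outgroup not in species_below(species_tree):
        return False
    node = species_tree
    # Descend to the lowest vertex whose subtree still contains a and b.
    while not isinstance(node, str):
        lower = [ch for ch in node if pair <= species_below(ch)]
        if not lower:
            break
        node = lower[0]
    below = species_below(node)
    return pair <= below and outgroup not in below


gene_tree = ("spec", ("dup", ("spec", "a", "b"), ("spec", "a", "c")), "d")
triples = speciation_triples(gene_tree)
print(all(displays(((("a", "b"), "c"), "d"), t) for t in triples))
```

In this example the triples rooted at the duplication vertex are ignored, so only ab|d, ac|d and bc|d constrain the species tree.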
Phylogenetic combinatorics is a branch of discrete applied mathematics concerned with the combinatorial description and analysis of phylogenetic trees and related mathematical structures such as phylogenetic networks and tight spans. Based on a natural conceptual framework, the book focuses on the interrelationship between the principal options for encoding phylogenetic trees: split systems, quartet systems and metrics. Such encodings provide useful options for analyzing and manipulating phylogenetic trees and networks, and are at the basis of much of phylogenetic data processing. This book highlights how each one provides a unique perspective for viewing and perceiving the combinatorial structure of a phylogenetic tree and is, simultaneously, a rich source for combinatorial analysis and theory building. Graduate students and researchers in mathematics and computer science will enjoy exploring this fascinating new area and learn how mathematics may be used to help solve topical problems arising in evolutionary biology.
Background: A simple and widely used approach for detecting hybridization in phylogenies is to reconstruct gene trees from independent gene loci, and to look for gene tree incongruence. However, this approach may be confounded by factors such as poor taxon-sampling and/or incomplete lineage-sorting. Results: Using coalescent simulations, we investigated the potential of supernetwork methods to differentiate between gene tree incongruence arising from taxon sampling and incomplete lineage-sorting as opposed to hybridization. For few hybridization events, a large number of independent loci, and well-sampled taxa across these loci, we found that it was possible to distinguish incomplete lineage-sorting from hybridization using the filtered Z-closure and Q-imputation supernetwork methods. Moreover, we found that the choice of supernetwork method was less important than the choice of filtering, and that count-based filtering was the most effective filtering technique. Conclusion: Filtered supernetworks provide a tool for detecting and identifying hybridization events in phylogenies, a tool that should become increasingly useful in light of current genome sequencing initiatives and the ease with which large numbers of independent gene loci can be determined using new generation sequencing technologies.
Polyploidy, the duplication of entire genomes, plays a major role in plant evolution. In allopolyploids, genome duplication is associated with hybridization between two or more divergent genomes. Successive hybridization and polyploidization events can build up species complexes of allopolyploids with complicated network-like histories, and the evolutionary history of many plant groups cannot be adequately represented by phylogenetic trees because of such reticulate events. The history of complex genome mergings within a high-polyploid species complex in the genus Cerastium (Caryophyllaceae) is here untangled by the use of a network algorithm and noncoding sequences of a low-copy number gene. The resulting network illustrates how hybridization and polyploidization have acted as key evolutionary processes in creating a plant group where high-level allopolyploids clearly outnumber extant parental genomes.
Phylogenetic networks are a generalization of phylogenetic trees that are used to represent non-tree-like evolutionary histories that arise in organisms such as plants and bacteria, or uncertainty in evolutionary histories. An unrooted phylogenetic network on a non-empty, finite set X of taxa, or network, is a connected, simple graph in which every vertex has degree 1 or 3 and whose leaf set is X. It is called a phylogenetic tree if the underlying graph is a tree. In this paper we consider properties of tree-based networks, that is, networks that can be constructed by adding edges into a phylogenetic tree. We show that although they have some properties in common with their rooted analogues which have recently drawn much attention in the literature, they have some striking differences in terms of both their structural and computational properties. We expect that our results could eventually have applications to, for example, detecting horizontal gene transfer or hybridization which are important factors in the evolution of many organisms.
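The definition of an unrooted phylogenetic network given in this abstract can be checked directly on an edge list. The sketch below assumes the graph has at least one edge and is supplied as pairs of vertex names; the function names are hypothetical, and the code tests only the definition, not whether a network is tree-based.

```python
def is_unrooted_network(edges, taxa):
    """Connected simple graph, all degrees 1 or 3, leaf set equal to taxa."""
    adjacency, seen = {}, set()
    for u, v in edges:
        key = frozenset((u, v))
        if u == v or key in seen:
            return False  # a loop or a parallel edge: the graph is not simple
        seen.add(key)
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    if not adjacency:
        return False  # assumption: at least one edge is required
    if any(len(nbrs) not in (1, 3) for nbrs in adjacency.values()):
        return False
    leaves = {v for v, nbrs in adjacency.items() if len(nbrs) == 1}
    if leaves != set(taxa):
        return False
    # Depth-first search to confirm that the graph is connected.
    start = next(iter(adjacency))
    stack, reached = [start], {start}
    while stack:
        for w in adjacency[stack.pop()]:
            if w not in reached:
                reached.add(w)
                stack.append(w)
    return len(reached) == len(adjacency)


def is_phylogenetic_tree(edges, taxa):
    """A network whose underlying graph is a tree: |E| = |V| - 1."""
    vertices = {v for e in edges for v in e}
    return is_unrooted_network(edges, taxa) and len(edges) == len(vertices) - 1


taxa = ["a", "b", "c", "d"]
quartet_tree = [("a", "u"), ("b", "u"), ("u", "v"), ("c", "v"), ("d", "v")]
cycle_network = [("p", "q"), ("q", "r"), ("r", "s"), ("s", "p"),
                 ("a", "p"), ("b", "q"), ("c", "r"), ("d", "s")]
print(is_phylogenetic_tree(quartet_tree, taxa))   # True
print(is_phylogenetic_tree(cycle_network, taxa))  # False: it contains a cycle
```

Both examples satisfy the network definition; only the first is a tree, because a connected graph is a tree exactly when it has one edge fewer than it has vertices.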