Phylogeny reconstruction is a key instrument in numerous biological analyses, ranging from evolutionary and ecological research to conservation and systems biology. The increasing accumulation of genomic data makes it possible to reconstruct phylogenies with high accuracy and at increasingly fine resolution. Yet taking advantage of the enormous amount of available sequence data requires computational tools for efficient data retrieval and processing; otherwise, the process can quickly become error-prone. Here, we present OneTwoTree (http://onetwotree.tau.ac.il/), a Web-based tool for tree reconstruction based on the supermatrix paradigm. Given a list of taxon names of interest as the sole input requirement, OneTwoTree retrieves all available sequence data from NCBI GenBank, clusters these into orthology groups, identifies the most informative set of markers, searches for an appropriate outgroup, and assembles a partitioned sequence matrix that is then used for the final phylogeny reconstruction step. OneTwoTree further allows users to control various steps of the process, such as the merging of sequences from similar clusters, or phylogeny reconstruction based on markers from a specific genome type. By comparing the performance of OneTwoTree to a manually reconstructed phylogeny of the Antirrhineae tribe, we show that the use of OneTwoTree resulted in substantially higher data coverage in terms of both taxon sampling and the number of informative markers assembled. OneTwoTree provides a flexible online tool for species-tree reconstruction, aimed at assisting researchers with varying levels of prior expertise in the task of phylogeny reconstruction.
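To make the supermatrix step concrete, the following is a minimal sketch of how per-marker alignments might be concatenated into a single partitioned matrix. It is not OneTwoTree's own code: the function name `build_supermatrix`, the input layout (marker name mapped to a taxon-to-sequence dictionary), and the choice of `-` as the missing-data symbol are all assumptions made for illustration. Taxa that lack a given marker are padded with gaps so that every row has the same length.

```python
from typing import Dict, List, Tuple

Alignment = Dict[str, str]          # taxon name -> aligned sequence
Partition = Tuple[str, int, int]    # marker name, first column, last column (1-based)


def build_supermatrix(
    markers: Dict[str, Alignment], gap: str = "-"
) -> Tuple[Dict[str, str], List[Partition]]:
    """Concatenate per-marker alignments into one matrix with partition bounds.

    Hypothetical helper for illustration only; taxa absent from a marker
    are filled with the gap symbol for that marker's full length.
    """
    # Union of all taxa seen in any marker, in a stable order.
    taxa = sorted({taxon for aln in markers.values() for taxon in aln})
    rows: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Partition] = []
    start = 1
    for name, aln in markers.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"marker {name!r} is empty or not aligned")
        length = lengths.pop()
        for taxon in taxa:
            rows[taxon].append(aln.get(taxon, gap * length))
        partitions.append((name, start, start + length - 1))
        start += length
    matrix = {taxon: "".join(parts) for taxon, parts in rows.items()}
    return matrix, partitions
```

The partition list records which columns belong to which marker, which is the information a partitioned analysis needs in order to fit a separate substitution model per marker.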
Summary: Chromosome number is a central feature of eukaryote genomes. Deciphering patterns of chromosome‐number change along a phylogeny is central to the inference of whole genome duplications and ancestral chromosome numbers. ChromEvol is a probabilistic inference tool that allows the evaluation of several models of chromosome‐number evolution and their fit to the data. However, fitting a model does not necessarily mean that the model describes the empirical data adequately. This vulnerability may lead to incorrect conclusions when model assumptions are not met by real data. Here, we present a model adequacy test for likelihood models of chromosome‐number evolution. The procedure allows us to determine whether the model can generate data with characteristics similar to those of the observed data. We demonstrate that using inadequate models can lead to inflated errors in several inference tasks. Applying the developed method to 200 angiosperm genera, we find that in many of these, the best‐fitting model provides a poor fit to the data. The inadequacy rate increases in large clades or in those in which hybridizations are present. The developed model adequacy test can help researchers to identify phylogenies whose underlying evolutionary patterns deviate substantially from current modelling assumptions and should guide future methods development.
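The general idea of such an adequacy test can be sketched as a parametric bootstrap: simulate many data sets from the fitted model, compute summary statistics on each, and ask whether the observed statistics fall within the simulated distribution. The sketch below is a generic illustration, not the procedure implemented for ChromEvol. The names `adequacy_test`, `toy_simulate`, and `STATS`, the two statistics chosen, and the toy simulator (which ignores the phylogeny entirely and only allows single gains and losses) are all assumptions.

```python
import random
from typing import Callable, Dict, List

Counts = List[int]  # chromosome numbers observed at the tips


def adequacy_test(
    observed: Counts,
    simulate: Callable[[random.Random], Counts],
    stats: Dict[str, Callable[[Counts], float]],
    n_sims: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Dict[str, bool]:
    """Return, per statistic, whether the observed value lies inside the
    central (1 - alpha) interval of values simulated under the model."""
    rng = random.Random(seed)
    sims = [simulate(rng) for _ in range(n_sims)]
    k = round(alpha / 2 * n_sims)  # number of simulations trimmed from each tail
    verdict: Dict[str, bool] = {}
    for name, stat in stats.items():
        values = sorted(stat(sim) for sim in sims)
        low, high = values[k], values[n_sims - 1 - k]
        verdict[name] = low <= stat(observed) <= high
    return verdict


def toy_simulate(
    rng: random.Random, n_tips: int = 30, base: int = 9, steps: int = 3
) -> Counts:
    """Toy stand-in for a fitted model: each tip starts at `base` and takes
    a few independent -1/0/+1 steps. No tree and no polyploidy (assumption)."""
    return [
        base + sum(rng.choice((-1, 0, 1)) for _ in range(steps))
        for _ in range(n_tips)
    ]


# Hypothetical summary statistics of the tip counts.
STATS: Dict[str, Callable[[Counts], float]] = {
    "range": lambda c: max(c) - min(c),
    "distinct": lambda c: len(set(c)),
}
```

For example, calling `adequacy_test([9] * 20 + [18] * 10, toy_simulate, STATS)` flags the `range` statistic as inadequate, because a model restricted to three single-step changes per tip cannot produce a doubled count of 18 alongside a count of 9.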