IQ-TREE (http://www.iqtree.org, last accessed February 6, 2020) is a user-friendly and widely used software package for phylogenetic inference using maximum likelihood. Since the release of version 1 in 2014, we have continuously expanded IQ-TREE to integrate a plethora of new models of sequence evolution and efficient computational approaches of phylogenetic inference to deal with genomic data. Here, we describe notable features of IQ-TREE version 2 and highlight the key advantages over other software.
Precise estimates of molecular rates are fundamental to our understanding of the processes of evolution. In principle, mutation and evolutionary rates for neutral regions of the same species are expected to be equal. However, a number of recent studies have shown that mutation rates estimated from pedigree material are much faster than evolutionary rates measured over longer time periods. To resolve this apparent contradiction, we have examined the hypervariable region (HVR I) of the mitochondrial genome using families of Adélie penguins (Pygoscelis adeliae) from the Antarctic. We sequenced 344 bp of the HVR I from penguins comprising 508 families with 915 chicks, together with both their parents. All of the 62 germline heteroplasmies that we detected in mothers were also detected in their offspring, consistent with maternal inheritance. These data give an estimated mutation rate (μ) of 0.55 mutations/site/Myrs (HPD 95% confidence interval of 0.29–0.88 mutations/site/Myrs) after accounting for the persistence of these heteroplasmies and the sensitivity of current detection methods. In comparison, the rate of evolution (k) of the same HVR I region, determined using DNA sequences from 162 known-age sub-fossil bones spanning a 37,000-year period, was 0.86 substitutions/site/Myrs (HPD 95% confidence interval of 0.53–1.17 substitutions/site/Myrs). Importantly, the latter rate is not statistically different from our estimate of the mutation rate. These results are in contrast to the view that molecular rates are time dependent.
Continuous-time Markov chains are a standard tool in phylogenetic inference. If homogeneity is assumed, the chain is formulated by specifying time-independent rates of substitutions between states in the chain. In applications, there are usually extra constraints on the rates, depending on the situation. If a model is formulated in this way, it is possible to generalise it and allow for an inhomogeneous process, with time-dependent rates satisfying the same constraints. It is then useful to require that there exists a homogeneous average of this inhomogeneous process within the same model. This leads to the definition of "Lie Markov models", which are precisely the class of models where such an average exists. These models form Lie algebras and hence concepts from Lie group theory are central to their derivation. In this paper, we concentrate on applications to phylogenetics and nucleotide evolution, and derive the complete hierarchy of Lie Markov models that respect the grouping of nucleotides into purines and pyrimidines, that is, models with purine/pyrimidine symmetry. We also discuss how to handle the subtleties of applying Lie group methods, most naturally defined over the complex field, to the stochastic case of a Markov process, where parameter values are restricted to be real and positive. In particular, we explore the geometric embedding of the cone of stochastic rate matrices within the ambient space of the associated complex Lie algebra. The whole list of Lie Markov models with purine/pyrimidine symmetry is available at
It has recently been observed by Ho et al. (Ho SYW, Phillips MJ, Cooper A, Drummond AJ. 2005. Time dependency of molecular rate estimates and systematic overestimation of recent divergence times. Mol Biol Evol. 22(7):1561-1568) that apparent rates of molecular evolution increase when measured over short timespans. I investigate whether the data are explainable purely by deleterious mutations. I derive an empirical approximation for the persistence of these mutations in a randomly mating population and, hence, derive lower limits on effective population sizes. These limits are high and get higher if additional reasonable assumptions are made. This casts doubt on whether deleterious mutations are able to explain the apparent rate acceleration.
When the process underlying DNA substitutions varies across evolutionary history, some standard Markov models underlying phylogenetic methods are mathematically inconsistent. The most prominent example is the general time-reversible model (GTR) together with some, but not all, of its submodels. To rectify this deficiency, nonhomogeneous Lie Markov models have been identified as the class of models that are consistent in the face of a changing process of DNA substitutions regardless of taxon sampling. Some well-known models in popular use are within this class, but are either overly simplistic (e.g., the Kimura two-parameter model) or overly complex (the general Markov model). On a diverse set of biological data sets, we test a hierarchy of Lie Markov models spanning the full range of parameter richness. Compared against the benchmark of the ever-popular GTR model, we find that as a whole the Lie Markov models perform well, with the best performing models having 8–10 parameters and the ability to recognize the distinction between purines and pyrimidines.
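The inconsistency described above can be illustrated numerically. The following is a minimal sketch, not the authors' procedure: it assumes that "consistent" can be read as closure under matrix multiplication, meaning that a process run under one rate matrix and then under another from the same model must again look like a process from that model. All function names and rate values are hypothetical and chosen only for illustration. The Kimura two-parameter model passes this check, whereas two symmetric rate matrices (a submodel of GTR with uniform base frequencies) that do not commute produce a combined transition matrix that is no longer time reversible.

```python
import numpy as np

NUC = "AGCT"  # purines (A, G) first, pyrimidines (C, T) second


def rate_matrix(pair_rates):
    """Build a symmetric rate matrix from rates for unordered pairs, e.g. {"AG": 1.0}."""
    q = np.zeros((4, 4))
    for pair, rate in pair_rates.items():
        i, j = NUC.index(pair[0]), NUC.index(pair[1])
        q[i, j] = q[j, i] = rate
    np.fill_diagonal(q, -q.sum(axis=1))  # rows of a rate matrix sum to zero
    return q


def k2p(alpha, beta):
    """Kimura two-parameter model: transition rate alpha, transversion rate beta."""
    return rate_matrix({"AG": alpha, "CT": alpha,
                        "AC": beta, "AT": beta, "GC": beta, "GT": beta})


def transition_matrix(q, t=1.0):
    """exp(q * t) for a symmetric rate matrix, via its eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(q)
    return (eigvecs * np.exp(eigvals * t)) @ eigvecs.T


def is_k2p_pattern(p):
    """True if p has the K2P pattern: one transition value, one transversion value."""
    transversions = [p[0, 2], p[0, 3], p[1, 2], p[1, 3]]
    return bool(np.allclose(p, p.T)
                and np.isclose(p[0, 1], p[2, 3])
                and np.allclose(transversions, p[0, 2])
                and np.allclose(np.diag(p), p[0, 0]))


# K2P on one part of the history, a different K2P on the next part.
p_k2p = transition_matrix(k2p(1.0, 0.2)) @ transition_matrix(k2p(0.3, 0.5))

# Two symmetric (hence time-reversible) rate matrices that do not commute.
base = {"AG": 0.1, "CT": 0.1, "AC": 0.1, "AT": 0.1, "GC": 0.1, "GT": 0.1}
sym_a = rate_matrix({**base, "AG": 1.0})
sym_b = rate_matrix({**base, "AC": 1.0})
p_sym = transition_matrix(sym_a) @ transition_matrix(sym_b)

print("K2P followed by K2P keeps the K2P pattern:", is_k2p_pattern(p_k2p))
# p_sym is doubly stochastic, so its stationary distribution is uniform and
# it is time reversible only if it is a symmetric matrix.
print("Symmetric followed by symmetric is reversible:", bool(np.allclose(p_sym, p_sym.T)))
```

The symmetric example is used because reversibility is easy to test there: the product of two symmetric stochastic matrices is doubly stochastic, so it can only be reversible if it is itself symmetric, which requires the two factors to commute.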
Background: Within eukaryotes there is a complex cascade of RNA-based macromolecules that process other RNA molecules, especially mRNA, tRNA and rRNA. An example is RNase MRP processing ribosomal RNA (rRNA) in ribosome biogenesis. One hypothesis is that this complexity was present early in eukaryotic evolution; an alternative is that an initial simpler network later gained complexity by gene duplication in lineages that led to animals, fungi and plants. Recently there has been a rapid increase in support for the complexity-early theory because the vast majority of these RNA-processing reactions are found throughout eukaryotes, and thus were likely to be present in the last common ancestor of living eukaryotes, herein called the Eukaryotic Ancestor.