The reconstruction of the Tree of Life has relied almost entirely on concatenation methods, which do not accommodate gene tree heterogeneity, a property that simulations and theory have identified as a likely cause of incongruent phylogenies. However, this incongruence has not yet been demonstrated in empirical studies. Several key relationships among eutherian mammals remain controversial, with conflicting results among previous studies, including the root of the eutherian tree and the relationships within Euarchontoglires and Laurasiatheria. Both Bayesian and maximum-likelihood analyses of genome-wide data of 447 nuclear genes from 37 species show that concatenation methods indeed yield strong incongruence in the phylogeny of eutherian mammals, as revealed by subsampling analyses of loci and taxa, which produced strongly conflicting topologies. In contrast, the coalescent methods, which accommodate gene tree heterogeneity, yield a phylogeny that is robust to variable gene and taxon sampling and is congruent with geographic data. The data also demonstrate that incomplete lineage sorting, a major source of gene tree heterogeneity, is relevant to deep-level phylogenies, such as those among eutherian mammals. Our results firmly place the eutherian root between Atlantogenata and Boreoeutheria and support ungulate polyphyly and a sister-group relationship between Scandentia and Primates. This study demonstrates that the incongruence introduced by concatenation methods is a major cause of long-standing uncertainty in the phylogeny of eutherian mammals, and the same may apply to other clades. Our analyses suggest that such incongruence can be resolved using phylogenomic data and coalescent methods that deal explicitly with gene tree heterogeneity.
In recent articles published in Molecular Phylogenetics and Evolution, Mark Springer and John Gatesy (S&G) present numerous criticisms of recent implementations and testing of the multispecies coalescent (MSC) model in phylogenomics, popularly known as "species tree" methods. After pointing out errors in alignments and gene tree rooting in recent phylogenomic data sets, particularly in Song et al. (2012) on mammals and Xi et al. (2014) on plants, they suggest that these errors seriously compromise the conclusions of these studies. Additionally, S&G enumerate numerous perceived violated assumptions and deficiencies in the application of the MSC model in phylogenomics, such as its assumption of neutrality and in particular the use of transcriptomes, which are deemed inappropriate for the MSC because the constituent exons often subtend large regions of chromosomes within which recombination is substantial. We acknowledge these previously reported errors in recent phylogenomic data sets, but disapprove of S&G's excessively combative and taunting tone. We show that these errors, as well as two nucleotide sorting methods used in the analysis of Amborella, have little impact on the conclusions of those papers. Moreover, several concepts introduced by S&G and an appeal to "first principles" of phylogenetics in an attempt to discredit MSC models are invalid and reveal numerous misunderstandings of the MSC. Contrary to the claims of S&G, we show that recent computer simulations used to test the robustness of MSC models are not circular and do not unfairly favor MSC models over concatenation. In fact, although both concatenation and MSC models clearly perform well in regions of tree space with long branches and little incomplete lineage sorting (ILS), simulations reveal the erratic behavior of concatenation when subjected to data subsampling and its tendency to produce spuriously confident yet conflicting results in regions of parameter space where MSC models still perform well. S&G's claims that MSC models explain little or none (0-15%) of the gene tree heterogeneity observed in a mammal data set and that MSC models assume ILS as the only source of gene tree variation are flawed. Overall, many of their criticisms of MSC models are invalidated when concatenation is appropriately viewed as a special case of the MSC, which in turn is a special case of emerging network models in phylogenomics. We reiterate that there is enormous promise and value in recent implementations and tests of the MSC and look forward to its increased use and refinement in phylogenomics.
As researchers collect increasingly large molecular data sets to reconstruct the Tree of Life, the heterogeneity of signals in the genomes of diverse organisms poses challenges for traditional phylogenetic analysis. A class of phylogenetic methods known as "species tree methods" has been proposed to directly address one important source of gene tree heterogeneity, namely the incomplete lineage sorting or deep coalescence that occurs when evolving lineages radiate rapidly, resulting in a diversity of gene trees from a single underlying species tree. Although such methods are gaining in popularity, they are being adopted with caution in some quarters, in part because of an increasing number of examples of strong phylogenetic conflict between concatenation or supermatrix methods and species tree methods. Here we review theory and empirical examples that help clarify these conflicts. Thinking of concatenation as a special case of the more general model provided by the multispecies coalescent can help explain a number of differences in the behavior of the two methods on phylogenomic data sets. Recent work suggests that species tree methods are more robust than concatenation approaches to some of the classic challenges of phylogenetic analysis, including rapidly evolving sites in DNA sequences, base compositional heterogeneity and long branch attraction. We show that approaches such as binning, designed to augment the signal in species tree analyses, can distort the distribution of gene trees and are inconsistent. Computationally efficient species tree methods that incorporate biological realism are a key to phylogenetic analysis of whole genome data.
The timing of the diversification of placental mammals relative to the Cretaceous-Paleogene (KPg) boundary mass extinction remains highly controversial. In particular, there have been seemingly irreconcilable differences in the dating of the early placental radiation not only between fossil-based and molecular datasets but also among molecular datasets. To help resolve this discrepancy, we performed genome-scale analyses using 4,388 loci from 90 taxa, including representatives of all extant placental orders and transcriptome data from flying lemurs (Dermoptera) and pangolins (Pholidota). Depending on the gene partitioning scheme, molecular clock model, and genic deviation from molecular clock assumptions, extensive sensitivity analyses recovered widely varying diversification scenarios for placental mammals from a given gene set, ranging from a deep Cretaceous origin and diversification to a scenario spanning the KPg boundary, suggesting that the use of suboptimal molecular clock markers and methodologies is a major cause of controversies regarding placental diversification timing. We demonstrate that reconciliation between molecular and paleontological estimates of placental divergence times can be achieved using the appropriate clock model and gene partitioning scheme while accounting for the degree to which individual genes violate molecular clock assumptions. A birth-death-shift analysis suggests that placental mammals underwent a continuous radiation across the KPg boundary without apparent interruption by the mass extinction, paralleling a genus-level radiation of multituberculates and ecomorphological diversification of both multituberculates and therians. These findings suggest that the KPg catastrophe evidently played a limited role in placental diversification, which, instead, was likely a delayed response to the slightly earlier radiation of angiosperms.
The earliest evolution of mammals and origins of mammalian features can be traced to the mammaliaforms of the Triassic and Jurassic periods that are extinct relatives of living mammals. Here we describe a new fossil from the Middle Jurassic that has a mandibular middle ear, a gradational transition of thoracolumbar vertebrae and primitive ankle features, but highly derived molars with a high crown and multiple roots that are partially fused. The upper molars have longitudinal cusp rows that occlude alternately with those of the lower molars. This specialization for masticating plants indicates that herbivory evolved among mammaliaforms, before the rise of crown mammals. The new species shares the distinctive dental features of the eleutherodontid clade, previously represented only by isolated teeth despite its extensive geographic distribution during the Jurassic. This eleutherodontid was terrestrial and had ambulatory gaits, analogous to extant terrestrial mammals such as armadillos or rock hyraxes. Its fur corroborates that mammalian integument had originated well before the common ancestor of living mammals.
Genome-scale sequence data have become increasingly available in phylogenetic studies for understanding the evolutionary histories of species. However, it is challenging to develop probabilistic models that account for the heterogeneity of phylogenomic data. The multispecies coalescent model describes gene trees as independent random variables generated from a coalescence process occurring along the lineages of the species tree. Since the multispecies coalescent model allows gene trees to vary across genes, coalescent-based methods have been widely used to account for heterogeneous gene trees in phylogenomic data analysis. In this paper, we summarize and evaluate the performance of coalescent-based methods for estimating species trees from genome-scale sequence data. We investigate the effects of deep coalescence and mutation on the performance of species tree estimation methods. We found that the coalescent-based methods perform well in estimating species trees for a large number of genes, regardless of the degree of deep coalescence and mutation. The performance of the coalescent methods is negatively correlated with the lengths of internal branches of the species tree.
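As a rough illustration of how the multispecies coalescent lets gene trees vary around a fixed species tree, the sketch below treats the simplest textbook case, which is not taken from the abstract above: a rooted three-taxon species tree ((A,B),C) with one sampled lineage per species and an internal branch of length t in coalescent units. Under those assumptions the standard result is that a gene tree matches the species tree with probability 1 - (2/3)e^(-t). The function names are hypothetical and exist only for this example.

```python
import math
import random


def concordance_probability(t):
    """Probability that a gene tree matches a three-taxon species tree.

    Assumes the species tree ((A,B),C), one lineage per species, and an
    internal branch of length t in coalescent units (an illustrative setup).
    """
    return 1.0 - (2.0 / 3.0) * math.exp(-t)


def simulate_concordance(t, n_genes, seed=0):
    """Estimate the same probability by simulating independent gene trees."""
    rng = random.Random(seed)
    matches = 0
    for _ in range(n_genes):
        # The A and B lineages coalesce at rate 1 in the internal branch,
        # so the waiting time is exponential with mean 1.
        if rng.expovariate(1.0) < t:
            matches += 1  # coalesced inside the internal branch: concordant
        elif rng.random() < 1.0 / 3.0:
            # Deep coalescence: all three lineages reach the root population,
            # where the three topologies are equally likely.
            matches += 1
    return matches / n_genes


if __name__ == "__main__":
    for t in (0.1, 0.5, 1.0, 3.0):
        exact = concordance_probability(t)
        approx = simulate_concordance(t, n_genes=20000)
        print(f"t={t:>4}: exact={exact:.3f} simulated={approx:.3f}")
```

In this toy setting a short internal branch pushes the concordance probability toward 1/3, so most gene trees disagree with the species tree, which is the deep coalescence (incomplete lineage sorting) that coalescent-based methods are designed to accommodate.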
The nomogram developed in this study demonstrated its discrimination capability for predicting 3- and 5-year occurrence of brain metastases, and can be used to identify high-risk patients.
The timing of the origin and diversification of rodents remains controversial, due to conflicting results from molecular clocks and paleontological data. The fossil record tends to support an early Cenozoic origin of crown-group rodents. In contrast, most molecular studies place the origin and initial diversification of crown-Rodentia deep in the Cretaceous, although some molecular analyses have recovered estimated divergence times that are more compatible with the fossil record. Here we attempt to resolve this conflict by carrying out a molecular clock investigation based on a nine-gene sequence dataset and a novel set of seven fossil constraints, including two new rodent records (the earliest known representatives of Cardiocraniinae and Dipodinae). Our results indicate that rodents originated around 61.7–62.4 Ma, shortly after the Cretaceous/Paleogene (K/Pg) boundary, and diversified at the intraordinal level around 57.7–58.9 Ma. These estimates are broadly consistent with the paleontological record, but challenge previous molecular studies that place the origin and early diversification of rodents in the Cretaceous. This study demonstrates that, with reliable fossil constraints, the incompatibility between paleontological and molecular estimates of rodent divergence times can be eliminated using currently available tools and genetic markers. Similar conflicts between molecular and paleontological evidence bedevil attempts to establish the origination times of other placental groups. The example of the present study suggests that more reliable fossil calibration points may represent the key to resolving these controversies.