Phylogenetic estimation has largely come to rely on explicitly model-based methods. This approach requires that a model be chosen and that that choice be justified. To date, justification has largely been accomplished through use of likelihood-ratio tests (LRTs) to assess the relative fit of a nested series of reversible models. While this approach certainly represents an important advance over arbitrary model selection, the best fit of a series of models may not always provide the most reliable phylogenetic estimates for finite real data sets, where all available models are surely incorrect. Here, we develop a novel approach to model selection, which is based on the Bayesian information criterion, but incorporates relative branch-length error as a performance measure in a decision theory (DT) framework. This DT method includes a penalty for overfitting, is applicable prior to running extensive analyses, and simultaneously compares all models being considered and thus does not rely on a series of pairwise comparisons of models to traverse model space. We evaluate this method by examining four real data sets and by using those data sets to define simulation conditions. In the real data sets, the DT method selects the same or simpler models than conventional LRTs. In order to lend generality to the simulations, codon-based models (with parameters estimated from the real data sets) were used to generate simulated data sets, which are therefore more complex than any of the models we evaluate. On average, the DT method selects models that are simpler than those chosen by conventional LRTs. Nevertheless, these simpler models provide estimates of branch lengths that are more accurate both in terms of relative error and absolute error than those derived using the more complex (yet still wrong) models chosen by conventional LRTs. This method is available in a program called DT-ModSel.
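The selection rule described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the DT-ModSel implementation: it assumes that approximate posterior model weights are taken proportional to exp(-BIC/2), that the performance measure is the Euclidean distance between branch-length vectors estimated under two models on a common tree, and that the chosen model is the one with the lowest weighted expected distance. The function names, the dictionary layout, and all numbers in the usage note are hypothetical.

```python
import math

def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: -2 lnL + k ln(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_sites)

def bic_weights(bics):
    """Approximate posterior model weights, proportional to exp(-BIC/2).

    The minimum BIC is subtracted first so the exponentials do not underflow.
    """
    best = min(bics)
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]

def branch_length_distance(a, b):
    """Euclidean distance between two branch-length vectors (assumed loss)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dt_select(models, n_sites):
    """Return (best_model_name, risks) under the assumptions stated above.

    `models` maps a model name to a dict with keys:
      "lnL"      maximized log-likelihood
      "k"        number of free parameters
      "branches" branch lengths estimated under that model
    """
    names = list(models)
    weights = bic_weights(
        [bic(models[m]["lnL"], models[m]["k"], n_sites) for m in names]
    )
    risks = {}
    for i in names:
        # Expected branch-length error of model i, averaged over all
        # candidate models weighted by their approximate posterior weight.
        risks[i] = sum(
            w * branch_length_distance(models[i]["branches"],
                                       models[j]["branches"])
            for j, w in zip(names, weights)
        )
    return min(risks, key=risks.get), risks
```

A call such as `dt_select({"simple": {"lnL": -1000.0, "k": 1, "branches": [0.10, 0.20]}, "complex": {"lnL": -990.0, "k": 5, "branches": [0.12, 0.22]}}, n_sites=1000)` compares every candidate against every other in a single pass, which is the sense in which the approach avoids a chain of pairwise tests.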
Investigation into model selection has a long history in the statistical literature. As model-based approaches begin dominating systematic biology, increased attention has focused on how models should be selected for distance-based, likelihood, and Bayesian phylogenetics. Here, we review issues that render model-based approaches necessary, briefly review nucleotide-based models that attempt to capture relevant features of evolutionary processes, and review methods that have been applied to model selection in phylogenetics: likelihood-ratio tests, AIC, BIC, and performance-based approaches.
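For orientation, the three likelihood-based criteria named above reduce to short formulas. The sketch below uses the standard textbook definitions; the function names and the log-likelihood values in the usage note are hypothetical, and the parameter count `k` is assumed to include whatever free parameters the analyst chooses to count (substitution parameters, and branch lengths if desired).

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 lnL + 2k (lower is better)."""
    return -2.0 * log_likelihood + 2.0 * n_params

def bic(log_likelihood, n_params, n_sites):
    """Bayesian information criterion: -2 lnL + k ln(n) (lower is better)."""
    return -2.0 * log_likelihood + n_params * math.log(n_sites)

def lrt_statistic(log_likelihood_null, log_likelihood_alt):
    """Likelihood-ratio test statistic for nested models: 2 (lnL_alt - lnL_null).

    For nested models the statistic is usually compared against a chi-square
    distribution whose degrees of freedom equal the difference in the number
    of free parameters.
    """
    return 2.0 * (log_likelihood_alt - log_likelihood_null)
```

With hypothetical values `lnL = -1000.0` for a model with 0 free substitution parameters and `lnL = -990.0` for a nested alternative with 4, `lrt_statistic(-1000.0, -990.0)` gives 20.0, which exceeds the 5% chi-square critical value for 4 degrees of freedom (about 9.49), while `aic` and `bic` each weigh the same gain in fit against their own penalty for the added parameters.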
We examine the evolution of mesic forest ecosystems in the Pacific Northwest of North America using a statistical phylogeography approach in four animal and two plant lineages. Three a priori hypotheses, which explain the disjunction in the mesic forest ecosystem with either recent dispersal or ancient vicariance, are tested with phylogenetic and coalescent methods. We find strong support in three amphibian lineages (Ascaphus spp., Dicamptodon spp., and Plethodon vandykei and P. idahoensis) for deep divergence between coastal and inland populations, as predicted by the ancient vicariance hypothesis. Unlike the amphibians, the disjunction in other Pacific Northwest lineages is likely due to recent dispersal along a northern route. Topological and population divergence tests support the northern dispersal hypothesis in the water vole (Microtus richardsoni) and northern dispersal has some support in both the dusky willow (Salix melanopsis) and whitebark pine (Pinus albicaulis). These analyses demonstrate that genetic data sampled from across an ecosystem can provide insight into the evolution of ecological communities and suggest that the advantages of a statistical phylogeographic approach are most pronounced in comparisons across multiple taxa in a particular ecosystem. Genetic patterns in organisms as diverse as willows and salamanders can be used to test general regional hypotheses, providing a consistent metric for comparison among members of an ecosystem with disparate life-history traits.
We investigated evolutionary relationships among deuterostome subgroups by obtaining nearly complete large-subunit ribosomal RNA (LSU rRNA)-gene sequences for 14 deuterostomes and 3 protostomes and complete small-subunit (SSU) rRNA-gene sequences for five of these animals. With the addition of previously published sequences, we compared 28 taxa using three different data sets (LSU only, SSU only, and combined LSU + SSU) under minimum evolution (with LogDet distances), maximum likelihood, and maximum parsimony optimality criteria. Additionally, we analyzed the combined LSU + SSU sequences with spectral analysis of LogDet distances, a technique that measures the amount of support and conflict within the data for every possible grouping of taxa. Overall, we found that (1) the LSU genes produced a tree very similar to the SSU gene tree, (2) adding LSU to SSU sequences strengthened the bootstrap support for many groups above the SSU-only values (e.g., hemichordates plus echinoderms as Ambulacraria; lancelets as the sister group to vertebrates), (3) LSU sequences did not support SSU-based hypotheses of pterobranchs evolving from enteropneusts and thaliaceans evolving from ascidians, and (4) the combined LSU + SSU data are ambiguous about the monophyly of chordates. No tree-building algorithm united urochordates conclusively with other chordates, although spectral analysis did so, providing our only evidence for chordate monophyly. With spectral analysis, we also evaluated several major hypotheses of deuterostome phylogeny that were constructed from morphological, embryological, and paleontological evidence. Our rRNA-gene analysis refutes most of these hypotheses and thus advocates a rethinking of chordate and vertebrate origins.
The phylogeography of Sumichrast's harvest mouse (Reithrodontomys sumichrasti) was examined through maximum-likelihood and parsimony analyses of 1,130 bp of mitochondrial Cytochrome b sequence data from 43 individuals. The phylogeography of this Middle American highland forest-dwelling species was compared to that previously published for the codistributed Aztec deer mouse complex (Peromyscus aztecus/Peromyscus hylocetes complex) in order to test competing hypotheses of concerted versus independent responses of codistributed forms to past climatic fluctuations. Qualitatively, there were strong similarities in the phylogeographic patterns of the two groups, yet there were also areas of incongruence. Likelihood-ratio tests (Kishino-Hasegawa-Templeton and parametric bootstrap tests) indicated that this incongruence is significant and cannot be attributed simply to uncertainty in phylogenetic estimation, thereby falsifying the concerted-response hypothesis. Conversely, tree-reconciliation analysis of the area relationships inferred for each group separately indicated that there has been a significant history of covicariance between the two groups, falsifying the independent-response hypothesis. It appears that codistributed taxa in the geologically complex highlands of Mesoamerica share more common biogeographical history than can be accounted for by the independent-response hypothesis yet have not responded to past climatic fluctuations in the lock-step fashion predicted by the concerted-response hypothesis.
Models that posit speciation in the face of gene flow are replacing classical views that hybridization is rare between animal species. We use a multilocus approach to examine the history of hybridization and gene flow between two species of chipmunks (Tamias ruficaudus and T. amoenus). Previous studies have shown that these species occupy different ecological niches and have distinct genital bone morphologies, yet appear to be incompletely isolated reproductively in multiple areas of sympatry. We compared data from four sequenced nuclear loci and from seven microsatellite loci to published cytochrome b sequences. Interspecific gene flow was primarily restricted to introgression of the T. ruficaudus mitochondrial genome into a sympatric subspecies of T. amoenus, T. a. canicaudus, with the four sequenced nuclear loci showing little to no interspecific allele sharing. Microsatellite data were consistent with high levels of differentiation between the species and also showed no current gene flow between broadly sympatric populations of T. a. canicaudus and T. ruficaudus. Coalescent analyses date the mtDNA introgression event from the mid-Pleistocene to late Pliocene. Overall, these data indicate that introgression has had a minimal impact on the nuclear genomes of T. amoenus and T. ruficaudus despite multiple independent hybridization events. Our findings challenge long-standing assumptions on patterns of reproductive isolation in chipmunks and suggest that there may be other examples of hybridization among the 23 species of Tamias that occur in western North America.
The sequence of the mitochondrial COII gene has been widely used to estimate phylogenetic relationships at different taxonomic levels across insects. We investigated the molecular evolution of the COII gene and its usefulness for reconstructing phylogenetic relationships within and among four collembolan families. The collembolan COII gene showed the lowest A + T content of all insects so far examined, confirming that the well-known A + T bias in insect mitochondrial genes tends to increase from the basal to apical orders. Fifty-seven percent of all nucleotide positions were variable and most of the third codon positions appeared free to vary. Values of genetic distance between congeneric species and between families were remarkably high; in some cases the latter were higher than divergence values between other orders of insects. The remarkably high divergence levels observed here provide evidence that collembolan taxa are quite old; divergence levels among collembolan families equaled or exceeded divergences among pterygote insect orders. Once the saturated third-codon positions (which violated stationarity of base frequencies) were removed, the COII sequences contained phylogenetic information, but the extent of that information was overestimated by parsimony methods relative to likelihood methods. In the phylogenetic analysis, consistent statistical support was obtained for the monophyly of all four genera examined, but relationships among genera/families were not well supported. Within the genus Orchesella, relationships were well resolved and agreed with allozyme data. Within the genus Isotomurus, although three pairs of populations were consistently identified, these appeared to have arisen in a burst of evolution from an earlier ancestor. Isotomurus italicus always appeared as basal and I. palustris appeared to harbor a cryptic species, corroborating allozyme data.