Species delimitation is the act of identifying species-level biological diversity. In recent years, the field has witnessed a dramatic increase in the number of methods available for delimiting species. However, most recent investigations utilize only a handful (i.e. 2-3) of the available methods, often for unstated reasons. Because the parameter space that is potentially relevant to species delimitation far exceeds the parameterization of any existing method, a given method necessarily makes a number of simplifying assumptions, any one of which could be violated in a particular system. We suggest that researchers should apply a wide range of species delimitation analyses to their data and place their trust in delimitations that are congruent across methods. Incongruence across the results from different methods is evidence either of a difference in the power to detect cryptic lineages among the approaches used to delimit species or of a violation of the assumptions of one or more of the methods. In either case, the inferences drawn from species delimitation studies should be conservative, for in most contexts it is better to fail to delimit species than it is to falsely delimit entities that do not represent actual evolutionary lineages.
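The recommendation to trust only delimitations that are congruent across methods can be illustrated with a small sketch. This is not code from the paper: the function name `congruent_groups`, the `min_support` parameter, and the input layout (one sample-to-label mapping per method) are assumptions made here for illustration. A group of samples is counted as supported by a method when that method delimits exactly that set of samples as one species.

```python
from collections import Counter


def congruent_groups(delimitations, min_support=None):
    """Return the groups of samples delimited identically by several methods.

    delimitations: dict mapping a method name to a dict of sample -> label.
    min_support: minimum number of methods that must recover a group;
                 defaults to all methods (the most conservative choice).
    The result maps each supported group (a frozenset of samples) to the
    number of methods that recovered it.
    """
    if min_support is None:
        min_support = len(delimitations)

    counts = Counter()
    for assignment in delimitations.values():
        # Collect the samples that share a label within this method.
        groups = {}
        for sample, label in assignment.items():
            groups.setdefault(label, set()).add(sample)
        # Labels are arbitrary per method, so compare groups by membership.
        for members in groups.values():
            counts[frozenset(members)] += 1

    return {group: n for group, n in counts.items() if n >= min_support}
```

With the default `min_support`, a group is reported only when every method agrees on it, which mirrors the conservative stance argued for above. Lowering `min_support` relaxes that requirement and should be treated as a deliberate choice.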
The conservation status of most plant species is currently unknown, despite the fundamental role of plants in ecosystem health. To facilitate the costly process of conservation assessment, we developed a predictive protocol using a machine-learning approach to predict conservation status of over 150,000 land plant species. Our study uses open-source geographic, environmental, and morphological trait data, making this the largest assessment of conservation risk to date and the only global assessment for plants. Our results indicate that a large number of unassessed species are likely at risk and identify several geographic regions with the highest need of conservation efforts, many of which are not currently recognized as regions of global concern. By providing conservation-relevant predictions at multiple spatial and taxonomic scales, predictive frameworks such as the one developed here fill a pressing need for biodiversity science.
Empirical phylogeographic studies have progressively sampled greater numbers of loci over time, in part motivated by theoretical papers showing that estimates of key demographic parameters improve as the number of loci increases. Recently, next-generation sequencing has been applied to questions about organismal history, with the promise of revolutionizing the field. However, no systematic assessment of how phylogeographic data sets have changed over time with respect to overall size and information content has been performed. Here, we quantify the changing nature of these genetic data sets over the past 20 years, focusing on papers published in Molecular Ecology. We found that the number of independent loci, the total number of alleles sampled and the total number of single nucleotide polymorphisms (SNPs) per data set have increased over time, with particularly dramatic increases within the past 5 years. Interestingly, uniparentally inherited organellar markers (e.g. animal mitochondrial and plant chloroplast DNA) continue to represent an important component of phylogeographic data. Single-species studies (cf. comparative studies) that focus on vertebrates (particularly fish and to some extent, birds) represent the gold standard of phylogeographic data collection. Based on the current trajectory seen in our survey data, forecast modelling indicates that the median number of SNPs per data set for studies published by the end of the year 2016 may approach ~20 000.
This survey provides baseline information for understanding the evolution of phylogeographic data sets and underscores the fact that development of analytical methods for handling very large genetic data sets will be critical for facilitating growth of the field.
Keywords: DNA sequences, information content, phylogeography, sampling, single nucleotide polymorphisms, temporal trends
Introduction
Phylogeographers have been working to collect multilocus data ever since a series of theoretical papers pertinent to the discipline demonstrated that estimates of key demographic parameters improve as the number of loci increases (e.g. Edwards & Beerli 2000; Hey & Nielsen 2004; Felsenstein 2006; Carling & Brumfield 2007). Recent improvements in DNA sequencing technology have led to platforms with greater speed, resolution and/or output (e.g. Margulies et al. 2005; Bentley et al. 2008; Rothberg et al. 2011) when compared to the traditional Sanger method. These technological advances, together with the development of general-purpose protocols for discovering and screening many DNA sequence polymorphisms arrayed across a species' genome (e.g. Baird et al. 2008; Kerstens et al. 2009; Faircloth et al. 2012; Peterson et al. 2012), are transforming the field of phylogeography into one that is no longer data limited. Investigations concerned with reconstructing long-term population history generally require large numbers of sampled alleles (i.e. many individuals and populations), across multiple loci, to adequately characterize levels of diversity and spatial genetic structuring (McCor...
Model checking is a critical part of Bayesian data analysis, yet it remains largely unused in systematic studies. Although phylogeny estimation has recently moved into an era of increasingly complex models that simultaneously account for multiple evolutionary processes, the statistical fit of these models to the data has rarely been tested. Here we develop a posterior predictive simulation-based model check for a commonly used multispecies coalescent model, implemented in *BEAST, and apply it to 25 published data sets. We show that poor model fit is detectable in the majority of data sets; that this poor fit can mislead phylogenetic estimation; and that in some cases it stems from processes of inherent interest to systematists. We suggest that as systematists scale up to phylogenomic data sets, which will be subject to a heterogeneous array of evolutionary processes, critically evaluating the fit of models to data is an analytical step that can no longer be ignored.
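The general logic of a posterior predictive check can be sketched in a few lines. The example below is not the *BEAST-based check described above and does not simulate under the multispecies coalescent. It assumes a toy normal model, and the names `posterior_predictive_p` and `simulate_normal` are hypothetical. For each posterior draw, one replicate data set is simulated and its test statistic is compared with the statistic of the observed data.

```python
import random


def posterior_predictive_p(observed, posterior_draws, simulate, statistic, rng):
    """Share of replicate data sets whose statistic is >= the observed one.

    observed: the observed data (any sized collection).
    posterior_draws: parameter values sampled from the posterior.
    simulate: function (theta, n, rng) -> replicate data set of size n.
    statistic: function mapping a data set to a single number.
    rng: a random.Random instance, passed in for reproducibility.
    """
    t_observed = statistic(observed)
    exceed = 0
    for theta in posterior_draws:
        replicate = simulate(theta, len(observed), rng)
        if statistic(replicate) >= t_observed:
            exceed += 1
    return exceed / len(posterior_draws)


def simulate_normal(theta, n, rng):
    """Toy stand-in model (an assumption): n independent normal draws."""
    mean, sd = theta
    return [rng.gauss(mean, sd) for _ in range(n)]
```

A value close to 0 or 1 indicates that the observed statistic sits in the tail of what the fitted model predicts, which is the signature of poor fit. In a phylogenetic setting the simulator and the statistic would be replaced by ones suited to the model under test.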