Within uncharacterized groups, DNA barcodes, short DNA sequences that are present in a wide range of species, can be used to assign organisms to species. We propose an automatic procedure that sorts the sequences into hypothetical species based on the barcode gap, which can be observed whenever the divergence among organisms belonging to the same species is smaller than divergence among organisms from different species. We use a range of prior intraspecific divergence to infer from the data a model-based one-sided confidence limit for intraspecific divergence. The method, called Automatic Barcode Gap Discovery (ABGD), then detects the barcode gap as the first significant gap beyond this limit and uses it to partition the data. Inference of the limit and gap detection are then recursively applied to previously obtained groups to get finer partitions until there is no further partitioning. Using six published data sets of metazoans, we show that ABGD is computationally efficient and performs well for standard prior maximum intraspecific divergences (a few per cent of divergence for the five data sets), except for one data set where fewer than three sequences per species were sampled. We further explore the theoretical limitations of ABGD through simulation of explicit speciation and population genetics scenarios. Our results emphasize in particular the sensitivity of the method to the presence of recent speciation events, via (unrealistically) high rates of speciation or large numbers of species. In conclusion, ABGD is a fast, simple method to split a sequence alignment data set into candidate species that should be complemented with other evidence in an integrative taxonomic approach.
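The barcode-gap idea described above can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions, not the ABGD implementation: the limit on intraspecific divergence is passed in directly rather than inferred from a model, a fixed minimum width stands in for the significance criterion, and the recursive refinement is omitted. The function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def find_gap(distances, limit, min_width):
    """Midpoint of the first gap between consecutive sorted distances that is
    at least min_width wide and whose upper edge is at or beyond limit.
    Both limit and min_width are assumptions supplied by the caller."""
    ordered = sorted(distances)
    for low, high in zip(ordered, ordered[1:]):
        if high >= limit and high - low >= min_width:
            return (low + high) / 2
    return None


def partition(seqs, threshold):
    """Group sequences linked by any chain of pairwise distances below threshold."""
    parent = {name: name for name in seqs}

    def find(name):
        # Follow parent links to the group representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical toy alignment: two groups of two sequences each.
toy = {
    "a1": "AAAAAAAAAA",
    "a2": "AAAAAAAAAT",
    "b1": "CCCCCAAAAA",
    "b2": "CCCCCAAAAT",
}
toy_distances = [p_distance(toy[a], toy[b]) for a, b in combinations(toy, 2)]
toy_gap = find_gap(toy_distances, limit=0.2, min_width=0.2)
toy_groups = partition(toy, toy_gap)
```

In this toy case the within-group distances (0.1) and between-group distances (0.5 to 0.6) are separated by a wide gap, so the threshold falls between them and the two groups are recovered.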
Here, we describe Assemble Species by Automatic Partitioning (ASAP), a new method to build species partitions from single locus sequence alignments (i.e., barcode data sets). ASAP is efficient enough to split data sets as large as 10^4 sequences into putative species in several minutes. Although grounded in evolutionary theory, ASAP is the implementation of a hierarchical clustering algorithm that only uses pairwise genetic distances, avoiding the computational burden of phylogenetic reconstruction. Importantly, ASAP proposes species partitions ranked by a new scoring system that uses no prior biological insight into intraspecific diversity. ASAP is a stand‐alone program that can be used either through a graphical web‐interface or downloaded and compiled for local usage. We have assessed its power along with three other programs (ABGD, PTP and GMYC) on 10 real COI barcode data sets representing various degrees of challenge (from small and easy cases to large and complicated data sets). We also used Monte‐Carlo simulations of a multispecies coalescent framework to assess the strengths and weaknesses of ASAP and the other programs. Through these analyses, we demonstrate that ASAP has the potential to become a major tool for taxonomists, as it rapidly proposes relevant species hypotheses in a fully graphical exploratory interface as a first step of the integrative taxonomy process.
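To make the idea of hierarchical clustering on pairwise distances concrete, the following Python sketch merges groups in order of increasing distance and records the partition obtained after each merge. It is an illustration only: the merging rule shown here is an assumption, ASAP's scoring system is not reproduced, and the function name and toy distances are hypothetical.

```python
def nested_partitions(names, dist):
    """Merge groups in order of increasing pairwise distance.

    dist maps frozenset({a, b}) to the distance between sequences a and b.
    Returns a list of (distance, partition) pairs, one per effective merge,
    going from the finest partition to a single group.
    """
    cluster_of = {name: frozenset([name]) for name in names}
    history = []
    for pair, d in sorted(dist.items(), key=lambda item: item[1]):
        a, b = tuple(pair)
        ca, cb = cluster_of[a], cluster_of[b]
        if ca == cb:
            continue  # both sequences already belong to the same group
        merged = ca | cb
        for name in merged:
            cluster_of[name] = merged
        snapshot = sorted(sorted(c) for c in set(cluster_of.values()))
        history.append((d, snapshot))
    return history


# Hypothetical toy distances for four sequences.
toy_dist = {
    frozenset({"a", "b"}): 0.01,
    frozenset({"c", "d"}): 0.02,
    frozenset({"a", "c"}): 0.10,
    frozenset({"a", "d"}): 0.11,
    frozenset({"b", "c"}): 0.12,
    frozenset({"b", "d"}): 0.13,
}
toy_history = nested_partitions(["a", "b", "c", "d"], toy_dist)
```

Each entry of `toy_history` is a candidate partition; a method such as ASAP would then rank such candidates with its own score, which this sketch does not attempt.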
A full list of authors and affiliations appears at the end of the paper. Purpose: To define the phenotypic and mutational spectrum of epilepsies related to DEPDC5, NPRL2 and NPRL3 genes encoding the GATOR1 complex, a negative regulator of the mTORC1 pathway. Methods: We analyzed clinical and genetic data of 73 novel probands (familial and sporadic) with epilepsy-related variants in GATOR1-encoding genes and proposed new guidelines for clinical interpretation of GATOR1 variants. Results: The GATOR1 seizure phenotype consisted mostly of focal seizures (e.g., hypermotor or frontal lobe seizures in 50%), with a mean age at onset of 4.4 years, often sleep-related and drug-resistant (54%), and associated with focal cortical dysplasia (20%). Infantile spasms were reported in 10% of the probands. Sudden unexpected death in epilepsy (SUDEP) occurred in 10% of the families. A novel classification framework of all 140 epilepsy-related GATOR1 variants (including the variants of this study) revealed that 68% are loss-of-function pathogenic, 14% are likely pathogenic, 15% are variants of uncertain significance and 3% are likely benign. Conclusion: Our data emphasize the increasingly important role of GATOR1 genes in the pathogenesis of focal epilepsies (>180 probands to date). The GATOR1 phenotypic spectrum ranges from sporadic early-onset epilepsies with cognitive impairment comorbidities to familial focal epilepsies, and SUDEP. Genetics in Medicine (2018) https://doi
Flavobacterium psychrophilum is currently one of the main bacterial pathogens hampering the productivity of salmonid farming worldwide, and its control mainly relies on antibiotic treatments. To better understand the population structure of this bacterium and its mode of evolution, we have examined the nucleotide polymorphisms at 11 protein-coding loci of the core genome in a set of 50 isolates. These isolates were selected to represent the broadest possible diversity, originating from 10 different host fish species and four continents. The nucleotide diversity between pairs of sequences amounted to fewer than four differences per kilobase on average, corresponding to a particularly low level of diversity, possibly indicative of a small effective-population size. The recombination rate, however, seemed remarkably high, and as a consequence, most of the isolates harbored unique combinations of alleles (33 distinct sequence types were resolved). The analysis also showed the existence of several clonal complexes with worldwide geographic distribution but marked association with particular fish species. Such an association could reflect preferential routes of transmission and/or adaptive niche specialization. The analysis provided no clues that the initial range of the bacterium was originally limited to North America. Instead, the historical record of the expansion of the pathogen may reflect the spread of a few clonal complexes. As a resource for future epidemiological surveys, a multilocus sequence typing website based on seven highly informative loci is available.
The main familial focal epilepsies of childhood are autosomal dominant nocturnal frontal lobe epilepsy, familial temporal lobe epilepsy and familial focal epilepsy with variable foci. A frameshift mutation in the DEPDC5 (DEP domain containing protein 5) gene was identified in a family with focal epilepsy with variable foci, by linkage analysis and exome sequencing. Subsequent pyrosequencing of DEPDC5 in a cohort of 15 additional families with focal epilepsies revealed four nonsense mutations and one missense mutation. Our findings provided evidence for frequent (37%) loss-of-function mutations in DEPDC5 associated with a broad spectrum of focal epilepsies. The implication of a DEP domain (Dishevelled, Egl-10 and Pleckstrin domain)-containing protein that may be involved in membrane trafficking and/or G-protein signaling opens new avenues for research.
New genes, with novel protein functions, can evolve "from scratch" out of intergenic sequences. These de novo genes can integrate into the cell's genetic network and drive important phenotypic innovations. Therefore, identifying de novo genes and understanding how the transition from noncoding to coding occurs are key problems in evolutionary biology. However, identifying de novo genes is a difficult task, hampered by the presence of remote homologs, fast evolving sequences and erroneously annotated protein coding genes. To overcome these limitations, we developed a procedure that handles the usual pitfalls in de novo gene identification and predicted the emergence of 703 de novo gene candidates in 15 yeast species from 2 genera whose phylogeny spans at least 100 million years of evolution. We validated 85 candidates by proteomic data, providing new translation evidence for 25 of them through mass spectrometry experiments. We also unambiguously identified the mutations that enabled the transition from noncoding to coding for 30 Saccharomyces de novo genes. We established that de novo gene origination is a widespread phenomenon in yeasts, only a few being ultimately maintained by selection. We also found that de novo genes preferentially emerge next to divergent promoters in GC-rich intergenic regions where the probability of finding a fortuitous and transcribed ORF is the highest. Finally, we found a more than 3-fold enrichment of de novo genes at recombination hot spots, which are GC-rich and nucleosome-free regions, suggesting that meiotic recombination contributes to de novo gene emergence in yeasts.
Neutrality tests based on the frequency spectrum (e.g., Tajima's D or Fu and Li's F) are commonly used by population geneticists as routine tests to assess the goodness-of-fit of the standard neutral model on their data sets. Here, I show that these neutrality tests are specific instances of a general model that encompasses them all. I illustrate how this general framework can be taken advantage of to devise new more powerful tests that better detect deviations from the standard model. Finally, I exemplify the usefulness of the framework on SNP data by showing how it supports the selection hypothesis in the lactase human gene by overcoming the ascertainment bias. The framework presented here paves the way for constructing novel tests optimized for specific violations of the standard model that ultimately will help to unravel scenarios of evolution.
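As a concrete instance of a frequency-spectrum test, the Python sketch below computes Tajima's D from an unfolded site frequency spectrum using the standard formula of Tajima (1989). It does not implement the general framework of the abstract above; the function name and the toy spectrum are hypothetical.

```python
from math import sqrt


def tajimas_d(sfs):
    """Tajima's D from an unfolded site frequency spectrum.

    sfs[i - 1] is the number of sites where the derived allele is carried by
    i of the n sampled sequences, for i = 1 .. n - 1.
    """
    n = len(sfs) + 1
    s = sum(sfs)  # number of segregating sites
    if s == 0:
        raise ValueError("Tajima's D is undefined without segregating sites")

    # Mean number of pairwise differences, computed from the spectrum.
    pairs = n * (n - 1) / 2
    pi = sum(i * (n - i) * count for i, count in enumerate(sfs, start=1)) / pairs

    # Constants of the standard formula.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # Difference between the two diversity estimators, scaled by its
    # estimated standard deviation.
    return (pi - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical spectrum for n = 10 sequences with an excess of singletons.
toy_sfs = [14, 0, 0, 1, 1, 0, 0, 0, 0]
toy_d = tajimas_d(toy_sfs)
```

An excess of rare variants, as in this toy spectrum, lowers the pairwise diversity relative to the number of segregating sites and yields a negative D.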
Since the 1980s, many have suggested we are in the midst of a massive extinction crisis, yet only 799 (0.04%) of the 1.9 million known recent species are recorded as extinct, questioning the reality of the crisis. This low figure is due to the fact that the status of very few invertebrates, which represent the bulk of biodiversity, has been evaluated. Here we show, based on extrapolation from a random sample of land snail species via two independent approaches, that we may already have lost 7% (130,000 extinctions) of the species on Earth. However, this loss is masked by the emphasis on terrestrial vertebrates, the target of most conservation actions. Projections of species extinction rates are controversial because invertebrates are essentially excluded from these scenarios. Invertebrates can and must be assessed if we are to obtain a more realistic picture of the sixth extinction crisis.
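The two headline figures above can be checked with a few lines of Python. The inputs are taken from the abstract; the variable names are hypothetical, and the 7% estimate is applied to the 1.9 million known species as an assumption about how the rounded figure of 130,000 was obtained.

```python
KNOWN_SPECIES = 1_900_000        # known recent species, from the abstract
RECORDED_EXTINCT = 799           # species recorded as extinct, from the abstract
ESTIMATED_LOSS_FRACTION = 0.07   # estimated share of species already lost

# Share of known species recorded as extinct, in per cent.
recorded_percent = 100 * RECORDED_EXTINCT / KNOWN_SPECIES

# Number of extinctions implied by the 7% estimate.
estimated_extinctions = ESTIMATED_LOSS_FRACTION * KNOWN_SPECIES
```

The recorded share comes out at roughly 0.04%, and 7% of 1.9 million is about 133,000, consistent with the rounded 130,000 quoted in the text.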