Stochastic simulation is a key tool in population genetics, since the models involved are often analytically intractable and simulation is usually the only way of obtaining ground-truth data to evaluate inferences. Because of this, a large number of specialized simulation programs have been developed, each filling a particular niche, but with largely overlapping functionality and a substantial duplication of effort. Here, we introduce msprime version 1.0, which efficiently implements ancestry and mutation simulations based on the succinct tree sequence data structure and the tskit library. We summarize msprime’s many features, and show that its performance is excellent, often many times faster and more memory efficient than specialized alternatives. These high-performance features have been thoroughly tested and validated, and built using a collaborative, open source development model, which reduces duplication of effort and promotes software quality via community engagement.
Speciation genomic studies aim to interpret patterns of genome-wide variation in light of the processes that give rise to new species. However, interpreting the genomic “landscape” of speciation is difficult, because many evolutionary processes can impact levels of variation. Facilitated by the first chromosome-level assembly for the group, we use whole-genome sequencing and simulations to shed light on the processes that have shaped the genomic landscape during a radiation of monkeyflowers. After inferring the phylogenetic relationships among the 9 taxa in this radiation, we show that highly similar diversity (π) and differentiation (FST) landscapes have emerged across the group. Variation in these landscapes was strongly predicted by the local density of functional elements and the recombination rate, suggesting that the landscapes have been shaped by widespread natural selection. Using the varying divergence times between pairs of taxa, we show that the correlations between FST and genome features arose almost immediately after a population split and have become stronger over time. Simulations of genomic landscape evolution suggest that background selection (BGS; i.e., selection against deleterious mutations) alone is too subtle to generate the observed patterns, but scenarios that involve positive selection and genetic incompatibilities are plausible alternative explanations. Finally, tests for introgression among these taxa reveal widespread evidence of heterogeneous selection against gene flow during this radiation. Combined with previous evidence for adaptation in this system, we conclude that the correlation in FST among these taxa informs us about the processes contributing to adaptation and speciation during a rapid radiation.
Speciation genomic studies aim to interpret patterns of genome-wide variation in light of the processes that give rise to new species. However, interpreting the genomic ‘landscape’ of speciation is difficult, because many evolutionary processes can impact levels of variation. Facilitated by the first chromosome-level assembly for the group, we use whole-genome sequencing and simulations to shed light on the processes that have shaped the genomic landscape during a recent radiation of monkeyflowers. After inferring the phylogenetic relationships among the nine taxa in this radiation, we show that highly similar diversity (π) and differentiation (FST) landscapes have emerged across the group. Variation in these landscapes was strongly predicted by the local density of functional elements and the recombination rate, suggesting that the landscapes have been shaped by widespread natural selection. Using the varying divergence times between pairs of taxa, we show that the correlations between FST and genome features arose almost immediately after a population split and have become stronger over time. Simulations of genomic landscape evolution suggest that background selection (i.e., selection against deleterious mutations) alone is too subtle to generate the observed patterns, but scenarios that involve positive selection and genetic incompatibilities are plausible alternative explanations. Finally, tests for introgression among these taxa reveal widespread evidence of heterogeneous selection against gene flow during this radiation. Thus, combined with existing evidence for adaptation in this system, we conclude that the correlation in FST among these taxa informs us about the genomic basis of adaptation and speciation in this system.
Author summary
What can patterns of genome-wide variation tell us about the speciation process? The answer to this question depends upon our ability to infer the evolutionary processes underlying these patterns. This, however, is difficult, because many processes can leave similar footprints, but some have nothing to do with speciation per se. For example, many studies have found highly heterogeneous levels of genetic differentiation when comparing the genomes of emerging species. These patterns are often referred to as differentiation ‘landscapes’ because they appear as a rugged topography of ‘peaks’ and ‘valleys’ as one scans across the genome. It has often been argued that selection against deleterious mutations, a process referred to as background selection, is primarily responsible for shaping differentiation landscapes early in speciation. If this hypothesis is correct, then it is unlikely that patterns of differentiation will reveal much about the genomic basis of speciation. However, using genome sequences from nine emerging species of monkeyflower coupled with simulations of genomic divergence, we show that it is unlikely that background selection is the primary architect of these landscapes. Rather, differentiation landscapes have probably been shaped by adaptation and gene flow, which are processes that are central to our understanding of speciation. Therefore, our work has important implications for our understanding of what patterns of differentiation can tell us about the genetic basis of adaptation and speciation.
Spatial and seasonal variations in the environment are ubiquitous. Environmental heterogeneity can affect natural populations and lead to covariation between environment and allele frequencies. Drosophila melanogaster is known to harbor polymorphisms that change both with latitude and seasons. Identifying the role of selection in driving these changes is not trivial, because nonadaptive processes can cause similar patterns. Given the environment changes in similar ways across seasons and along the latitudinal gradient, one promising approach may be to look for parallelism between clinal and seasonal changes. Here, we test whether there is a genome‐wide correlation between clinal and seasonal changes, and whether the pattern is consistent with selection. Allele frequency estimates were obtained from pooled samples from seven different locations along the east coast of the United States, and across seasons within Pennsylvania. We show that there is a genome‐wide correlation between clinal and seasonal variations, which cannot be explained by linked selection alone. This pattern is stronger in genomic regions with higher functional content, consistent with natural selection. We derive a way to biologically interpret these correlations and show that around 3.7% of the common, autosomal variants could be under parallel seasonal and spatial selection. Our results highlight the contribution of natural selection in driving fluctuations in allele frequencies in natural fly populations and point to a shared genomic basis to climate adaptation that happens over space and time in D. melanogaster.