Since the initial description of the genomic patterns expected under models of positive selection acting on standing genetic variation and on multiple beneficial mutations—so-called soft selective sweeps—researchers have sought to identify these patterns in natural population data. Indeed, over the past two years, large-scale data analyses have argued that soft sweeps are pervasive across organisms of very different effective population size and mutation rate—humans, Drosophila, and HIV. Yet, others have evaluated the relevance of these models to natural populations, as well as the identifiability of the models relative to other known population-level processes, arguing that soft sweeps are likely to be rare. Here, we look to reconcile these opposing results by carefully evaluating three recent studies and their underlying methodologies. Using population genetic theory, as well as extensive simulation, we find that all three examples are prone to extremely high false-positive rates, incorrectly identifying soft sweeps under both hard sweep and neutral models. Furthermore, we demonstrate that well-fit demographic histories combined with rare hard sweeps serve as the more parsimonious explanation. These findings represent a necessary response to the growing tendency of invoking parameter-heavy, assumption-laden models of pervasive positive selection, and neglecting best practices regarding the construction of proper demographic null models.
Adaptive evolution progresses as a series of steps toward a multidimensional phenotypic optimum, and organismal or environmental complexity determines the number of phenotypic dimensions, or traits, under selection. Populations evolving in complex environments may experience costs of complexity such that improvement in one or more traits is impeded by selection on others. We compared the fitness effects of the first fixed mutations for populations of single-stranded DNA bacteriophage evolving under simple selection for growth rate to those of populations evolving under more complex selection for growth rate as well as capsid stability. We detected a cost of complexity manifested as a smaller growth rate improvement for mutations fixed under complex conditions. We found that, despite imposing a cost for growth rate improvement, strong complex selection resulted in the greatest overall fitness improvement, even for single mutations. Under weaker secondary selective pressures, tradeoffs between growth rate and stability were pervasive, but strong selection on the secondary trait resulted largely in mutations beneficial to both traits. Strength of selection therefore determined the nature of pleiotropy governing observed trait evolution, and strong positive selection forced populations to find mutations that improved multiple traits, thereby overriding costs incurred as a result of a more complex selective environment. The costs of complexity, however, remained substantial when considering the effects on a single trait in the context of selection on multiple traits.
Convergent evolution has been demonstrated across all levels of biological organization, from parallel nucleotide substitutions to convergent evolution of complex phenotypes, but whether instances of convergence are the result of selection repeatedly finding the same optimal solution to a recurring problem or are the product of mutational biases remains unsettled. We generated 20 replicate lineages allowed to fix a single mutation from each of four bacteriophage genotypes under identical selective regimes to test for parallel changes within and across genotypes at the levels of mutational effect distributions and gene, protein, amino acid, and nucleotide changes. All four genotypes shared a distribution of beneficial mutational effects best approximated by a distribution with a finite upper bound. Parallel adaptation was high at the protein, gene, amino acid, and nucleotide levels, both within and among phage genotypes, with the most common first-step mutation in each background fixing on average in 7 of 20 replicates and half of the substitutions in two of the four genotypes occurring at shared sites. Remarkably, the mutation of largest beneficial effect that fixed for each genotype was never the most common, contrary to what would be expected if parallelism were driven by selection. In fact, the mutation of smallest benefit for each genotype fixed in a total of 7 of 80 lineages, as often as the mutation of largest benefit, leading us to conclude that adaptation was largely mutation-driven, such that mutational biases led to frequent parallel fixation of mutations of suboptimal effect.
The recent increase in time-series population genomic data from experimental, natural, and ancient populations has been accompanied by a promising growth in methodologies for inferring demographic and selective parameters from such data. However, these methods have largely presumed that the populations of interest are well-described by the Kingman coalescent. In reality, many groups of organisms, including viruses, marine organisms, and some plants, protists, and fungi, typified by high variance in progeny number, may be best characterized by multiple-merger coalescent models. Estimation of population genetic parameters under Wright-Fisher assumptions for these organisms may thus be prone to serious mis-inference. We propose a novel method for the joint inference of demography and selection under the Ψ-coalescent model, termed Multiple-Merger Coalescent Approximate Bayesian Computation, or MMC-ABC. We first demonstrate mis-inference under the Kingman coalescent, and then exhibit the superior performance of MMC-ABC under conditions of skewed offspring distributions. To highlight the utility of this approach, we reanalyzed previously published drug-selection lines of influenza A virus. We jointly inferred the extent of progeny-skew inherent to viral replication and identified putative drug-resistance mutations.