Previous genome-wide scans of positive natural selection in humans have identified a number of non-neutrally evolving genes that play important roles in skin pigmentation, metabolism, or immune function. Recent studies have also shown that a genome-wide pattern of local adaptation can be detected by identifying correlations between patterns of allele frequencies and environmental variables. Despite these observations, the degree to which natural selection is primarily driven by adaptation to local environments, and the role of pathogens or other ecological factors as selective agents, is still under debate. To address this issue, we correlated the spatial allele frequency distribution of a large sample of SNPs from 55 distinct human populations to a set of environmental factors that describe local geographical features such as climate, diet regimes, and pathogen loads. In concordance with previous studies, we detected a significant enrichment of genic SNPs, and particularly non-synonymous SNPs associated with local adaptation. Furthermore, we show that the diversity of the local pathogenic environment is the predominant driver of local adaptation, and that climate, at least as measured here, only plays a relatively minor role. While background demography by far makes the strongest contribution in explaining the genetic variance among populations, we detected about 100 genes which show an unexpectedly strong correlation between allele frequencies and pathogenic environment, after correcting for demography. Conversely, for diet regimes and climatic conditions, no genes show a similar correlation between the environmental factor and allele frequencies. This result is validated using low-coverage sequencing data for multiple populations. 
Among the loci targeted by pathogen-driven selection, we found an enrichment of genes associated with autoimmune diseases, such as celiac disease, type 1 diabetes, and multiple sclerosis, which lends credence to the hypothesis that some susceptibility alleles for autoimmune diseases may be maintained in human populations due to past selective processes.
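The core idea described above, correlating allele frequencies across populations with an environmental variable after accounting for demography, can be illustrated with a partial correlation. The Python sketch below is only a toy stand-in and not the authors' method: reducing demography to a single covariate, the function names, and all numbers are assumptions made for illustration, and a real analysis would need a fuller model of population structure.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def residuals(y, z):
    """Residuals of a simple least-squares regression of y on z."""
    n = len(y)
    mean_z, mean_y = sum(z) / n, sum(y) / n
    slope = (sum((a - mean_z) * (b - mean_y) for a, b in zip(z, y))
             / sum((a - mean_z) ** 2 for a in z))
    intercept = mean_y - slope * mean_z
    return [b - (intercept + slope * a) for a, b in zip(z, y)]


def partial_correlation(freq, env, demo):
    """Correlation of allele frequency with an environmental variable
    after regressing the demographic covariate out of both."""
    return pearson(residuals(freq, demo), residuals(env, demo))


if __name__ == "__main__":
    # Hypothetical data for six populations (not taken from any study).
    demo = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]        # assumed demographic axis
    env = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]         # assumed pathogen diversity
    freq = [0.22, 0.30, 0.41, 0.39, 0.55, 0.70]  # assumed allele frequencies
    print("raw correlation:    ", pearson(freq, env))
    print("partial correlation:", partial_correlation(freq, env, demo))
```

Comparing the raw and partial correlations shows how much of an apparent frequency-environment association could be explained by the demographic covariate alone.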
In the age of next-generation sequencing, the availability of increasing amounts and improved quality of data at decreasing cost ought to allow for a better understanding of how natural selection is shaping the genome than ever before. However, alternative forces, such as demography and background selection (BGS), obscure the footprints of positive selection that we would like to identify. In this review, we illustrate recent developments in this area, and outline a roadmap for improved selection inference. We argue (i) that the development and obligatory use of advanced simulation tools is necessary for improved identification of selected loci, (ii) that genomic information from multiple time points will enhance the power of inference, and (iii) that results from experimental evolution should be utilized to better inform population genomic studies.
In the age of next-generation sequencing, the availability of increasing amounts and quality of data at decreasing cost ought to allow for a better understanding of how natural selection is shaping the genome than ever before. Yet, alternative forces such as demography and background selection obscure the footprints of positive selection that we would like to identify. Here, we illustrate recent developments in this area, and outline a roadmap for improved selection inference. We argue (1) that the development and obligatory use of advanced simulation tools is necessary for improved identification of selected loci, (2) that genomic information from multiple time points will enhance the power of inference, and (3) that results from experimental evolution should be utilized to better inform population-genomic studies.
bioRxiv preprint first posted online Sep. 25, 2014; doi: http://dx.doi.org/10.1101/009654. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.
Identification of beneficial mutations in the genome: an ongoing quest
The identification of genetic variants that confer an advantage to an organism, and that have spread by forces other than chance, remains an important question in evolutionary biology. Success in this regard will have broad implications not only for informing our view of the process of evolution itself, but also for evolutionary applications ranging from clinical to ecological. Despite the tremendous quantity of polymorphism data now at our fingertips, which in principle ought to allow for a better characterization of such adaptive genetic variants, it remains a challenge to unambiguously identify alleles under selection.
This is primarily owing to the difficulty in disentangling the effects of positive selection from those of other factors that shape the composition of genomes, including both demography and other selective processes.
Approaches to identify positively selected variants from genomic data can be broadly divided into two categories: those that make use of within-population polymorphism data, and those that make use of between-population/species data. While each approach has its respective merits, we here focus on recent developments in population-genetic inference from polymorphism data in both natural and experimental settings (see [1,2] for more general reviews, and [3,4] for recent and specific literature on divergence-based selection inference). For population-genetic inference from single-time-point polymorphism data (as is most commonly the case), this includes not only sophisticated statistical methods, but also simulation programs that enable us to model expected genomic signatures under a wide variety of possible scenarios. Alternatively, data from multiple time points, such as those recently afforded by ancient genomic data as well as many clinical and experimental datasets, can be used to greatly improve inference by catching a selective sweep...
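To make the role of simulation and multiple-time-point data concrete, the following Python sketch simulates a single allele under a textbook Wright-Fisher model with genic selection and reads off its frequency at a few sampling times. It is a minimal illustration under stated assumptions, not one of the simulation programs the text refers to: the function names, parameter values, and the haploid-style fitness scheme (1 + s versus 1) are all hypothetical choices.

```python
import random


def wright_fisher(n_chrom, p0, s, generations, seed=None):
    """Simulate an allele-frequency trajectory with selection and drift.

    n_chrom     -- number of chromosomes sampled each generation (assumed constant)
    p0          -- starting frequency of the focal allele
    s           -- selection coefficient (focal allele has fitness 1 + s)
    generations -- number of generations to simulate
    """
    rng = random.Random(seed)
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Deterministic change in frequency due to selection.
        p_sel = p * (1 + s) / (p * (1 + s) + (1 - p))
        # Genetic drift: binomial sampling of the next generation.
        count = sum(1 for _ in range(n_chrom) if rng.random() < p_sel)
        p = count / n_chrom
        trajectory.append(p)
    return trajectory


def sample_time_points(trajectory, times):
    """Return frequencies at the given generations, mimicking time-series data."""
    return [trajectory[t] for t in times]


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    selected = wright_fisher(n_chrom=400, p0=0.05, s=0.1, generations=100, seed=42)
    neutral = wright_fisher(n_chrom=400, p0=0.05, s=0.0, generations=100, seed=42)
    times = [0, 25, 50, 75, 100]
    print("selected:", sample_time_points(selected, times))
    print("neutral: ", sample_time_points(neutral, times))
```

Running many replicates of the neutral case under a chosen demographic scenario gives a null distribution against which an observed trajectory can be compared; a single end-point sample discards the information about the path the allele took.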