The peopling of the Americas has been the subject of extensive genetic, archaeological and linguistic research; however, central questions remain unresolved [1–5]. One contentious issue is whether the settlement occurred via a single [6–8] or multiple streams of migration from Siberia [9–15]. The pattern of dispersals within the Americas is also poorly understood. To address these questions at higher resolution than was previously possible, we assembled data from 52 Native American and 17 Siberian groups genotyped at 364,470 single nucleotide polymorphisms. We show that Native Americans descend from at least three streams of Asian gene flow. Most descend entirely from a single ancestral population that we call “First American”. However, speakers of Eskimo-Aleut languages from the Arctic inherit almost half their ancestry from a second stream of Asian gene flow, and the Na-Dene-speaking Chipewyan from Canada inherit roughly one-tenth of their ancestry from a third stream. We show that the initial peopling followed a southward expansion facilitated by the coast, with sequential population splits and little gene flow after divergence, especially in South America. A major exception is in Chibchan-speakers on both sides of the Panama Isthmus, who have ancestry from both North and South America.
Loci involved in local adaptation can potentially be identified by an unusual correlation between allele frequencies and important ecological variables or by extreme allele frequency differences between geographic regions. However, such comparisons are complicated by differences in sample sizes and the neutral correlation of allele frequencies across populations due to shared history and gene flow. To overcome these difficulties, we have developed a Bayesian method that estimates the empirical pattern of covariance in allele frequencies between populations from a set of markers and then uses this as a null model for a test at individual SNPs. In our model the sample frequencies of an allele across populations are drawn from a set of underlying population frequencies; a transform of these population frequencies is assumed to follow a multivariate normal distribution. We first estimate the covariance matrix of this multivariate normal across loci using a Monte Carlo Markov chain. At each SNP, we then provide a measure of the support, a Bayes factor, for a model where an environmental variable has a linear effect on the transformed allele frequencies compared to a model given by the covariance matrix alone. This test is shown through power simulations to outperform existing correlation tests. We also demonstrate that our method can be used to identify SNPs with unusually large allele frequency differentiation and offers a powerful alternative to tests based on pairwise or global F_ST. Software is available at http://www.eve.ucdavis.edu/gmcoop/.
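The idea of using the between-population covariance as a null model can be illustrated with a much simpler, non-Bayesian analogue. The sketch below is not the authors' MCMC method and does not compute a Bayes factor. It assumes a plain standardization of frequencies, a covariance averaged over reference SNPs, and a generalized-least-squares z-like score for an environmental variable. All function names, the `ridge` parameter, and the toy frequencies are hypothetical.

```python
import math

def standardize(freqs):
    """Center one SNP's population frequencies on their mean p and scale by
    sqrt(p(1-p)). Assumes 0 < p < 1 (an assumed transform, not the paper's)."""
    p = sum(freqs) / len(freqs)
    scale = math.sqrt(p * (1.0 - p))
    return [(f - p) / scale for f in freqs]

def covariance(snps):
    """Average outer product over standardized SNPs: a K x K matrix that
    summarizes shared history among K populations."""
    k = len(snps[0])
    return [[sum(s[i] * s[j] for s in snps) / len(snps) for j in range(k)]
            for i in range(k)]

def cholesky(a, ridge=0.0):
    """Lower-triangular L with L L^T = a + ridge * I."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][m] * low[j][m] for m in range(j))
            if i == j:
                low[i][j] = math.sqrt(s + ridge)
            else:
                low[i][j] = s / low[j][j]
    return low

def whiten(low, v):
    """Solve low @ w = v by forward substitution, which removes the
    covariance structure encoded in low from the vector v."""
    w = []
    for i, row in enumerate(low):
        w.append((v[i] - sum(row[m] * w[m] for m in range(i))) / row[i])
    return w

def gls_z(cov, env, snp, ridge=0.0):
    """z-like score for a linear effect of env on snp when the noise has
    covariance cov (+ ridge * I): (env' S^-1 snp) / sqrt(env' S^-1 env)."""
    low = cholesky(cov, ridge)
    we, ws = whiten(low, env), whiten(low, snp)
    return sum(a * b for a, b in zip(we, ws)) / math.sqrt(sum(a * a for a in we))

if __name__ == "__main__":
    # Hypothetical frequencies: rows are reference SNPs, columns are 4 populations.
    reference = [
        [0.10, 0.12, 0.30, 0.33],
        [0.50, 0.47, 0.20, 0.25],
        [0.80, 0.75, 0.60, 0.66],
        [0.35, 0.40, 0.15, 0.10],
        [0.22, 0.18, 0.45, 0.50],
    ]
    env = [-1.5, -0.5, 0.5, 1.5]  # hypothetical centered environmental variable
    cov = covariance([standardize(s) for s in reference])
    candidate = standardize([0.05, 0.20, 0.40, 0.60])
    # The ridge keeps the matrix invertible; centering makes cov singular.
    print(gls_z(cov, env, candidate, ridge=0.05))
```

Whitening both vectors by the same Cholesky factor means that populations with shared history contribute less independent evidence, which is the intuition behind using the covariance matrix as the null.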
Evolutionary pressures due to variation in climate play an important role in shaping phenotypic variation among and within species and have been shown to influence variation in phenotypes such as body shape and size among humans. Genes involved in energy metabolism are likely to be central to heat and cold tolerance. To test the hypothesis that climate shaped variation in metabolism genes in humans, we used a bioinformatics approach based on network theory to select 82 candidate genes for common metabolic disorders. We genotyped 873 tag SNPs in these genes in 54 worldwide populations (including the 52 in the Human Genome Diversity Project panel) and found correlations with climate variables using rank correlation analysis and a newly developed method termed Bayesian geographic analysis. In addition, we genotyped 210 carefully matched control SNPs to provide an empirical null distribution for spatial patterns of allele frequency due to population history alone. For nearly all climate variables, we found an excess of genic SNPs in the tail of the distributions of the test statistics compared to the control SNPs, implying that metabolic genes as a group show signals of spatially varying selection. Among our strongest signals were several SNPs (e.g., LEPR R109K, FABP2 A54T) that had previously been associated with phenotypes directly related to cold tolerance. Since variation in climate may be correlated with other aspects of environmental variation, it is possible that some of the signals that we detected reflect selective pressures other than climate. Nevertheless, our results are consistent with the idea that climate has been an important selective pressure acting on candidate genes for common metabolic disorders.
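The comparison of genic SNPs against matched control SNPs can be sketched as a tail-excess calculation: take a cutoff from the upper tail of the control statistics and ask what fraction of genic SNPs exceed it. This is a minimal sketch under assumptions. The function name `tail_excess`, the 5% default tail, and the toy statistics are hypothetical and are not taken from the study.

```python
def tail_excess(genic_stats, control_stats, tail=0.05):
    """Return (cutoff, genic_fraction, ratio), where cutoff marks the top
    `tail` share of control statistics and ratio compares the genic and
    control fractions at or beyond that cutoff."""
    ranked = sorted(control_stats, reverse=True)
    n_tail = max(1, round(len(ranked) * tail))
    cutoff = ranked[n_tail - 1]
    genic_frac = sum(1 for s in genic_stats if s >= cutoff) / len(genic_stats)
    # Use the realized control fraction so that ties at the cutoff are handled.
    control_frac = sum(1 for s in control_stats if s >= cutoff) / len(control_stats)
    return cutoff, genic_frac, genic_frac / control_frac

if __name__ == "__main__":
    # Hypothetical absolute test statistics (e.g. |rank correlation| with climate).
    control = [0.02, 0.05, 0.08, 0.10, 0.11, 0.13, 0.15, 0.18, 0.21, 0.25,
               0.27, 0.30, 0.31, 0.33, 0.36, 0.40, 0.42, 0.47, 0.52, 0.60]
    genic = [0.05, 0.12, 0.35, 0.58, 0.61, 0.66, 0.20, 0.44, 0.70, 0.09]
    cutoff, frac, ratio = tail_excess(genic, control, tail=0.10)
    print(f"cutoff={cutoff:.2f} genic_fraction={frac:.2f} ratio={ratio:.2f}")
```

A ratio above 1 indicates more genic SNPs in the tail than the control SNPs would predict; judging whether such an excess is significant would need a resampling step that this sketch omits.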
Humans inhabit a remarkably diverse range of environments, and adaptation through natural selection has likely played a central role in the capacity to survive and thrive in extreme climates. Unlike numerous studies that used only population genetic data to search for evidence of selection, here we scan the human genome for selection signals by identifying the SNPs with the strongest correlations between allele frequencies and climate across 61 worldwide populations. We find a striking enrichment of genic and nonsynonymous SNPs relative to non-genic SNPs among those that are strongly correlated with these climate variables. Among the most extreme signals, several overlap with those from GWAS, including SNPs associated with pigmentation and autoimmune diseases. Further, we find an enrichment of strong signals in gene sets related to UV radiation, infection and immunity, and cancer. Our results imply that adaptations to climate shaped the spatial distribution of variation in humans.
Members of the cytochrome P450 3A subfamily catalyze the metabolism of endogenous substrates, environmental carcinogens, and clinically important exogenous compounds, such as prescription drugs and therapeutic agents. In particular, the CYP3A4 and CYP3A5 genes play an especially important role in pharmacogenetics, since they metabolize >50% of the drugs on the market. However, known genetic variants at these two loci are not sufficient to account for the observed phenotypic variability in drug response. We used a comparative genomics approach to identify conserved coding and noncoding regions at these genes and resequenced them in three ethnically diverse human populations. We show that remarkable interpopulation differences exist with regard to frequency spectrum and haplotype structure. The non-African samples are characterized by a marked excess of rare variants and the presence of a homogeneous group of long-range haplotypes at high frequency. The CYP3A5*1/*3 polymorphism, which is likely to influence salt and water retention and risk for salt-sensitive hypertension, was genotyped in >1,000 individuals from 52 worldwide population samples. The results reveal an unusual geographic pattern whereby the CYP3A5*3 frequency shows extreme variation across human populations and is significantly correlated with distance from the equator. Furthermore, we show that an unlinked variant, AGT M235T, previously implicated in hypertension and pre-eclampsia, exhibits a similar geographic distribution and is significantly correlated in frequency with CYP3A5*1/*3. Taken together, these results suggest that variants that influence salt homeostasis were the targets of a shared selective pressure that resulted from an environmental variable correlated with latitude.
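A correlation between allele frequency and distance from the equator can be illustrated with a Spearman rank correlation against absolute latitude. The abstract does not state which correlation statistic was used, so the choice of Spearman here is an assumption, and the latitudes and frequencies in the example are hypothetical values for illustration only.

```python
import math

def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(x), ranks(y))

if __name__ == "__main__":
    # Hypothetical population samples: latitude in degrees and allele frequency.
    latitude = [-25.0, -3.0, 8.0, 15.0, 31.0, 40.0, 48.0, 60.0]
    allele_freq = [0.40, 0.15, 0.20, 0.35, 0.60, 0.75, 0.85, 0.90]
    distance_from_equator = [abs(lat) for lat in latitude]
    print(spearman(distance_from_equator, allele_freq))
```

Taking the absolute value of latitude treats northern and southern populations symmetrically. Note that a raw correlation of this kind does not account for shared population history.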
Human populations use a variety of subsistence strategies to exploit an exceptionally broad range of ecoregions and dietary components. These aspects of human environments have changed dramatically during human evolution, giving rise to new selective pressures. To understand the genetic basis of human adaptations, we combine population genetics data with ecological information to detect variants that increased in frequency in response to new selective pressures. Our approach detects SNPs that show concordant differences in allele frequencies across populations with respect to specific aspects of the environment. Genic and especially nonsynonymous SNPs are overrepresented among those most strongly correlated with environmental variables. This provides genome-wide evidence for selection due to changes in ecoregion, diet, and subsistence. We find particularly strong signals associated with polar ecoregions, with foraging, and with a diet rich in roots and tubers. Interestingly, several of the strongest signals overlap with those implicated in energy metabolism phenotypes from genome-wide association studies, including SNPs influencing glucose levels and susceptibility to type 2 diabetes. Furthermore, several pathways, including those of starch and sucrose metabolism, are enriched for strong signals of adaptations to a diet rich in roots and tubers, whereas signals associated with polar ecoregions are overrepresented in genes associated with energy metabolism pathways.
Keywords: cold tolerance | foraging | genome-wide association studies | roots and tubers | soft sweeps
Admixture is recognized as a widespread feature of human populations, renewing interest in the possibility that genetic exchange can facilitate adaptations to new environments. Studies of Tibetans revealed candidates for high-altitude adaptations in the EGLN1 and EPAS1 genes, associated with lower hemoglobin concentration. However, the history of these variants or that of Tibetans remains poorly understood. Here, we analyze genotype data for the Nepalese Sherpa, and find that Tibetans are a mixture of ancestral populations related to the Sherpa and Han Chinese. EGLN1 and EPAS1 genes show a striking enrichment of high-altitude ancestry in the Tibetan genome, indicating that migrants from low altitude acquired adaptive alleles from the highlanders. Accordingly, the Sherpa and Tibetans share adaptive hemoglobin traits. This admixture-mediated adaptation shares important features with adaptive introgression. Therefore, we identify a novel mechanism, beyond selection on new mutations or on standing variation, through which populations can adapt to local environments.
Although hypoxia is a major stress on physiological processes, several human populations have survived for millennia at high altitudes, suggesting that they have adapted to hypoxic conditions. This hypothesis was recently corroborated by studies of Tibetan highlanders, which showed that polymorphisms in candidate genes show signatures of natural selection as well as well-replicated association signals for variation in hemoglobin levels. We extended genomic analysis to two Ethiopian ethnic groups: Amhara and Oromo. For each ethnic group, we sampled low and high altitude residents, thus allowing genetic and phenotypic comparisons across altitudes and across ethnic groups. Genome-wide SNP genotype data were collected in these samples by using Illumina arrays. We find that variants associated with hemoglobin variation among Tibetans or other variants at the same loci do not influence the trait in Ethiopians. However, in the Amhara, SNP rs10803083 is associated with hemoglobin levels at genome-wide levels of significance. No significant genotype association was observed for oxygen saturation levels in either ethnic group. Approaches based on allele frequency divergence did not detect outliers in candidate hypoxia genes, but the most differentiated variants between high- and lowlanders have a clear role in pathogen defense. Interestingly, a significant excess of allele frequency divergence was consistently detected for genes involved in cell cycle control and DNA damage and repair, thus pointing to new pathways for high altitude adaptations. Finally, a comparison of CpG methylation levels between high- and lowlanders found several significant signals at individual genes in the Oromo.