It is commonly thought that human genetic diversity in non-African populations was shaped primarily by an out-of-Africa dispersal 50–100 thousand yr ago (kya). Here, we present a study of 456 geographically diverse high-coverage Y chromosome sequences, including 299 newly reported samples. Applying ancient DNA calibration, we date the Y-chromosomal most recent common ancestor (MRCA) in Africa at 254 (95% CI 192–307) kya and detect a cluster of major non-African founder haplogroups in a narrow time interval at 47–52 kya, consistent with a rapid initial colonization model of Eurasia and Oceania after the out-of-Africa bottleneck. In contrast to demographic reconstructions based on mtDNA, we infer a second strong bottleneck in Y-chromosome lineages dating to the last 10 ky. We hypothesize that this bottleneck is caused by cultural changes affecting variance of reproductive success among males.
Southern and eastern African populations that speak non-Bantu languages with click consonants are known to harbour some of the most ancient genetic lineages in humans, but their relationships are poorly understood. Here, we report data from 23 populations analysed at over half a million single-nucleotide polymorphisms, using a genome-wide array designed for studying human history. The southern African Khoisan fall into two genetic groups, loosely corresponding to the northwestern and southeastern Kalahari, which we show separated within the last 30,000 years. We find that all individuals derive at least a few percent of their genomes from admixture with non-Khoisan populations that began ∼1,200 years ago. In addition, the East African Hadza and Sandawe derive a fraction of their ancestry from admixture with a population related to the Khoisan, supporting the hypothesis of an ancient link between southern and eastern Africa.
To reconstruct modern human evolutionary history and identify loci that have shaped hunter-gatherer adaptation, we sequenced the whole genomes of five individuals in each of three different hunter-gatherer populations at >60× coverage: Pygmies from Cameroon and Khoesan-speaking Hadza and Sandawe from Tanzania. We identify 13.4 million variants, substantially increasing the set of known human variation. We found evidence of archaic introgression in all three populations, and the distribution of time to most recent common ancestors from these regions is similar to that observed for introgressed regions in Europeans. Additionally, we identify numerous loci that harbor signatures of local adaptation, including genes involved in immunity, metabolism, olfactory and taste perception, reproduction, and wound healing. Within the Pygmy population, we identify multiple highly differentiated loci that play a role in growth and anterior pituitary function and are associated with height.
Whole genome sequencing and SNP genotyping arrays can paint strikingly different pictures of demographic history and natural selection. This is because genotyping arrays contain biased sets of pre-ascertained SNPs. In this short review, we use comparisons between high-coverage whole genome sequences of African hunter-gatherers and data from genotyping arrays to highlight how SNP ascertainment bias distorts population genetic inferences. Sample sizes and the populations in which SNPs are discovered affect the characteristics of observed variants. We find that SNPs on genotyping arrays tend to be older and present in multiple populations. In addition, genotyping arrays cause allele frequency distributions to be shifted towards intermediate frequency alleles, and estimates of linkage disequilibrium are modified. Since population genetic analyses depend on allele frequencies, it is imperative that researchers are aware of the effects of SNP ascertainment bias. With this in mind, we describe multiple ways to correct for SNP ascertainment bias.
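The shift towards intermediate-frequency alleles described in this abstract can be illustrated with a toy calculation. The sketch below is not the review's method: it assumes a standard neutral site frequency spectrum (probability of derived count i proportional to 1/i), treats sample frequency as population frequency, and assumes a SNP reaches the array only if it is polymorphic in a small discovery panel drawn with replacement. The function names and the panel size are hypothetical.

```python
def neutral_sfs(n):
    """Expected neutral site frequency spectrum for n chromosomes.

    Returns probabilities for derived allele counts 1..n-1, with
    P(count = i) proportional to 1/i (standard neutral expectation).
    """
    weights = [1.0 / i for i in range(1, n)]
    total = sum(weights)
    return [w / total for w in weights]


def ascertained_sfs(n, panel_size):
    """Reweight the neutral spectrum by the chance of SNP discovery.

    Assumption (illustrative only): a SNP at frequency x is placed on the
    array if it is polymorphic in a discovery panel of `panel_size`
    chromosomes sampled with replacement, which happens with probability
    1 - x**panel_size - (1 - x)**panel_size.
    """
    raw = neutral_sfs(n)
    reweighted = []
    for i, p in zip(range(1, n), raw):
        x = i / n
        p_discovered = 1.0 - x ** panel_size - (1.0 - x) ** panel_size
        reweighted.append(p * p_discovered)
    total = sum(reweighted)
    return [w / total for w in reweighted]


def mean_frequency(sfs):
    """Mean derived allele frequency implied by a spectrum over counts 1..n-1."""
    n = len(sfs) + 1
    return sum((i / n) * p for i, p in zip(range(1, n), sfs))


if __name__ == "__main__":
    raw = neutral_sfs(100)
    array_like = ascertained_sfs(100, panel_size=4)
    print("mean frequency, all SNPs:        ", round(mean_frequency(raw), 3))
    print("mean frequency, ascertained SNPs:", round(mean_frequency(array_like), 3))
```

Under these assumptions, rare variants are seldom polymorphic in a small panel, so the ascertained spectrum loses mass at low frequencies and its mean moves towards intermediate values.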
Background: Accurate assessment of health disparities requires unbiased knowledge of genetic risks in different populations. Unfortunately, most genome-wide association studies use genotyping arrays and European samples. Here, we integrate whole genome sequence data from global populations, results from thousands of genome-wide association studies (GWAS), and extensive computer simulations to identify how genetic disease risks can be misestimated. Results: In contrast to null expectations, we find that risk allele frequencies at known disease loci are significantly different for African populations compared to other continents. Strikingly, ancestral risk alleles are found at 9.51% higher frequency in Africa, and derived risk alleles are found at 5.40% lower frequency in Africa. By simulating GWAS with different study populations, we find that non-African cohorts yield disease associations that have biased allele frequencies and that African cohorts yield disease associations that are relatively free of bias. We also find empirical evidence that genotyping arrays and SNP ascertainment bias contribute to continental differences in risk allele frequencies. Because of these causes, polygenic risk scores can be grossly misestimated for individuals of African descent. Importantly, continental differences in risk allele frequencies are only moderately reduced if GWAS use whole genome sequences and hundreds of thousands of cases and controls. Finally, comparisons between uncorrected and corrected genetic risk scores reveal the benefits of considering whether risk alleles are ancestral or derived. Conclusions: Our results imply that caution must be taken when extrapolating GWAS results from one population to predict disease risks in another population. Electronic supplementary material: The online version of this article (10.1186/s13059-018-1561-7) contains supplementary material, which is available to authorized users.
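To make concrete why differences in risk allele frequency feed into polygenic risk scores, the following sketch computes the expected additive score of a population. It is an illustration rather than the study's procedure: it assumes a simple additive score (sum of effect size times risk allele dosage) and Hardy-Weinberg proportions, so the expected dosage at a locus is 2p. The function name, frequencies, and effect sizes are hypothetical and not taken from the paper.

```python
def expected_prs(risk_allele_freqs, effect_sizes):
    """Expected additive polygenic risk score for a population.

    Assumes the score is sum(beta_i * dosage_i) and that the expected
    dosage of the risk allele at locus i is 2 * p_i (Hardy-Weinberg).
    """
    if len(risk_allele_freqs) != len(effect_sizes):
        raise ValueError("need one effect size per locus")
    return sum(2.0 * p * beta for p, beta in zip(risk_allele_freqs, effect_sizes))


if __name__ == "__main__":
    # Hypothetical effect sizes shared by both populations.
    betas = [0.10, 0.20, 0.05]
    # Hypothetical risk allele frequencies in two populations.
    freqs_a = [0.20, 0.50, 0.40]
    freqs_b = [0.30, 0.60, 0.50]
    print("expected score, population A:", expected_prs(freqs_a, betas))
    print("expected score, population B:", expected_prs(freqs_b, betas))
```

With identical effect sizes, the population with higher risk allele frequencies receives a higher expected score. If those frequency differences arise from how the loci were discovered rather than from true differences in risk, the scores are shifted for the same reason.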
Phylogeny estimation is difficult for closely related populations and species, especially if they have been exchanging genes. We present a hierarchical Bayesian, Markov-chain Monte Carlo method with a state space that includes all possible phylogenies in a full Isolation-with-Migration model framework. The method is based on a new type of genealogy augmentation called a “hidden genealogy” that enables efficient updating of the phylogeny. This is the first likelihood-based method to fully incorporate directional gene flow and genetic drift for estimation of a species or population phylogeny. Application to human hunter-gatherer populations from Africa revealed a clear phylogenetic history, with strong support for gene exchange with an unsampled ghost population, and relatively ancient divergence between a ghost population and modern human populations, consistent with human/archaic divergence. In contrast, a study of five chimpanzee populations reveals a clear phylogeny with several pairs of populations having exchanged DNA, but does not support a history with an unsampled ghost population.
Comparisons of whole-genome sequences from ancient and contemporary samples have pointed to several instances of archaic admixture through interbreeding between the ancestors of modern non-Africans and now extinct hominids such as Neanderthals and Denisovans. One implication of these findings is that some adaptive features in contemporary humans may have entered the population via gene flow with archaic forms in Eurasia. Within Africa, fossil evidence suggests that anatomically modern humans (AMH) and various archaic forms coexisted for much of the last 200,000 yr; however, the absence of ancient DNA in Africa has limited our ability to make a direct comparison between archaic and modern human genomes. Here, we use statistical inference based on high coverage whole-genome data (greater than 60×) from contemporary African Pygmy hunter-gatherers as an alternative means to study the evolutionary history of the genus Homo. Using whole-genome simulations that consider demographic histories that include both isolation and gene flow with neighboring farming populations, our inference method rejects the hypothesis that the ancestors of AMH were genetically isolated in Africa, thus providing model-based whole genome-level evidence of African archaic admixture. Our inferences also suggest a complex human evolutionary history in Africa, which involves at least a single admixture event from an unknown archaic population into the ancestors of AMH, likely within the last 30,000 yr.