In recent years, different types of structural variants (SVs) have been discovered in the human genome and their functional impact has become increasingly clear. Inversions, however, are poorly characterized and more difficult to study, especially those mediated by inverted repeats or segmental duplications. Here, we describe the results of a simple and fast inverse PCR (iPCR) protocol for high-throughput genotyping of a wide variety of inversions using a small amount of DNA. In particular, we analyzed 22 inversions predicted in humans ranging from 5.1 kb to 226 kb and mediated by inverted repeat sequences of 1.6–24 kb. First, we validated 17 of the 22 inversions in a panel of nine HapMap individuals from different populations, and we genotyped them in 68 additional individuals of European origin, with correct genetic transmission in ∼12 mother-father-child trios. Global inversion minor allele frequency varied between 1% and 49% and inversion genotypes were consistent with Hardy-Weinberg equilibrium. By analyzing the nucleotide variation and the haplotypes in these regions, we found that only four inversions have linked tag-SNPs and that in many cases there are multiple shared SNPs between standard and inverted chromosomes, suggesting an unexpectedly high degree of inversion recurrence during human evolution. iPCR was also used to check 16 of these inversions in four chimpanzees and two gorillas, and 10 showed both orientations either within or between species, providing additional support for their multiple origins. Finally, we have identified several inversions that include genes in the inverted or breakpoint regions, and at least one disrupts a potential coding gene. Thus, these results represent a significant advance in our understanding of inversion polymorphism in human populations and challenge the common view of a single origin of inversions, with important implications for inversion analysis in SNP-based studies.
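The abstract above reports minor allele frequencies and consistency with Hardy-Weinberg equilibrium without giving the procedure. As a minimal sketch of how such a check can be done from genotype counts, assuming a one-degree-of-freedom chi-square goodness-of-fit test (the abstract does not say which test the authors used), the function name `hwe_chi_square` and the genotype counts below are hypothetical and serve only as illustration:

```python
import math


def hwe_chi_square(n_std_std, n_het, n_inv_inv):
    """Chi-square test of Hardy-Weinberg proportions for one biallelic inversion.

    Arguments are counts of standard/standard, standard/inverted and
    inverted/inverted individuals. Returns (maf, chi2, p_value).
    Illustrative only; not necessarily the test used in the study.
    """
    n = n_std_std + n_het + n_inv_inv
    if n == 0:
        raise ValueError("at least one genotyped individual is required")

    # Frequency of the inverted allele (q) and the standard allele (p).
    q = (2 * n_inv_inv + n_het) / (2 * n)
    p = 1.0 - q

    observed = (n_std_std, n_het, n_inv_inv)
    expected = (n * p * p, 2 * n * p * q, n * q * q)

    # Skip classes with zero expectation (monomorphic case).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))

    return min(p, q), chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts, not data from the study.
    maf, chi2, p_value = hwe_chi_square(40, 30, 7)
    print(f"MAF={maf:.3f} chi2={chi2:.3f} p={p_value:.3f}")
```

With samples as small as the panels described here, an exact test is often preferred over the chi-square approximation; the sketch only shows the quantities involved.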
Inversions are one type of structural variant linked to phenotypic differences and adaptation in multiple organisms. However, there is still very little information about polymorphic inversions in the human genome due to the difficulty of their detection. Here, we develop a new high-throughput genotyping method based on probe hybridization and amplification, and we perform a complete study of 45 common human inversions of 0.1–415 kb. Most inversions promoted by homologous recombination occur recurrently in humans and great apes and they are not tagged by SNPs. Furthermore, there is an enrichment of inversions showing signatures of positive or balancing selection, diverse functional effects, such as gene disruption and gene-expression changes, or association with phenotypic traits. Therefore, our results indicate that the genome is more dynamic than previously thought and that human inversions have important functional and evolutionary consequences, making it possible to determine for the first time their contribution to complex traits.
The view from southwestern Angola offers a new perspective on the populating history of southern Africa and the Bantu expansions by showing that social stratification and different subsistence patterns are not always indicative of remnant groups, but may reflect Bantu-internal variation and ethnogenesis.
The Bantu expansion, which started in West Central Africa around 5,000 BP, constitutes a major migratory movement involving the joint spread of peoples and languages across sub-Saharan Africa. Despite the rich linguistic and archaeological evidence available, the genetic relationships between different Bantu-speaking populations and the migratory routes they followed during various phases of the expansion remain poorly understood. Here, we analyze the genetic profiles of southwestern and southeastern Bantu-speaking peoples located at the edges of the Bantu expansion by generating genome-wide data for 200 individuals from 12 Mozambican and 3 Angolan populations using ∼1.9 million autosomal single nucleotide polymorphisms. Incorporating a wide range of available genetic data, our analyses confirm previous results favoring a “late split” between West and East Bantu speakers, following a joint passage through the rainforest. In addition, we find that Bantu speakers from eastern Africa display genetic substructure, with Mozambican populations forming a gradient of relatedness along a North–South cline stretching from the coastal border between Kenya and Tanzania to South Africa. This gradient is further associated with a southward increase in genetic homogeneity, and involved minimal admixture with resident populations. Together, our results provide the first genetic evidence in support of a rapid North–South dispersal of Bantu peoples along the Indian Ocean Coast, as inferred from the distribution and antiquity of Early Iron Age assemblages associated with the Kwale archaeological tradition.
Two Bolivian samples belonging to the two main Andean linguistic groups (Aymaras and Quechuas) were studied for mtDNA and Y-chromosome uniparental markers to evaluate sex-specific differences and give new insights into the demographic processes of the Andean region. mtDNA-coding polymorphisms, HVI-HVII control regions, 17 Y-STRs, and three SNPs were typed in two well-defined populations with adequate sample sizes. The two Bolivian samples showed more genetic differences for the mtDNA than for the Y-chromosome. For the mtDNA, 81% of Aymaras and 61% of Quechuas presented haplogroup B2. Native American Y-chromosomes were found in 97% of Aymaras (89% hg Q1a3a and 11% hg Q1a3*) and 78% of Quechuas (100% hg Q1a3a). Our data revealed high diversity values in the two populations, in agreement with other Andean studies. The comparisons with the available literature for both sets of markers indicated that the central Andean area is relatively homogeneous. For mtDNA, the Aymaras seemed to have been more isolated over time, maintaining their genetic characteristics, while the Quechuas have been more permeable to the incorporation of female foreigners and Peruvian influences. On the other hand, male mobility would have been widespread across the Andean region according to the homogeneity found in the area. Particular genetic characteristics presented by both samples support a past common origin of the Altiplano populations in the ancient Aymara territory, with independent, although related, histories with respect to Peruvian (Quechuas) populations.
The growing catalogue of structural variants in humans often overlooks inversions as one of the most difficult types of variation to study, even though they affect phenotypic traits in diverse organisms. Here, we have analysed in detail 90 inversions predicted from the comparison of two independently assembled human genomes: the reference genome (NCBI36/HG18) and HuRef. Surprisingly, we found that two thirds of these predictions (62) represent errors either in assembly comparison or in one of the assemblies, including 27 misassembled regions in HG18. Next, we validated 22 of the remaining 28 potential polymorphic inversions using different PCR techniques and characterized their breakpoints and ancestral state. In addition, we determined experimentally the derived allele frequency in Europeans for 17 inversions (DAF = 0.01-0.80), as well as the distribution in 14 worldwide populations for 12 of them based on the 1000 Genomes Project data. Among the validated inversions, nine have inverted repeats (IRs) at their breakpoints, and two show nucleotide variation patterns consistent with a recurrent origin. Conversely, inversions without IRs have a unique origin and almost all of them show deletions or insertions at the breakpoints in the derived allele mediated by microhomology sequences, which highlights the importance of mechanisms like FoSTeS/MMBIR in the generation of complex rearrangements in the human genome. Finally, we found several inversions located within genes and at least one candidate to be positively selected in Africa. Thus, our study emphasizes the importance of careful analysis and validation of large-scale genomic predictions to extract reliable biological conclusions.
Despite the interest in characterizing genomic variation, the presence of large repeats at the breakpoints hinders the analysis of many structural variants. This is especially problematic for inversions, since there is typically no gain or loss of DNA. Here, we tested novel linkage-based droplet digital PCR (ddPCR) assays to study 20 inversions ranging from 3.1 to 742 kb flanked by inverted repeats (IRs) up to 134 kb long. Of those, we validated 13 inversions predicted by different genome-wide techniques. In addition, we obtained new experimental human population information across 95 African, European, and East Asian individuals for 16 inversions, including four already validated variants without high-throughput genotyping methods. Through comparison with previous data, independent replicates and both inversion breakpoints, we demonstrate that the technique is highly accurate and reproducible. Most studied inversions are widespread across continents, and their frequency is negatively correlated with genetic length. Moreover, all except two show clear signs of being recurrent, and we could better define the factors affecting recurrence levels and estimate the inversion rate across the genome. Finally, the generated genotypes have allowed us to check inversion functional effects, validating gene expression differences reported before for two inversions and finding new candidate associations. Therefore, the developed methodology makes it possible to screen these and other complex genomic variants quickly in a large number of samples for the first time, highlighting the importance of direct genotyping to assess their potential consequences and clinical implications.
Background: The recent increase in human polymorphism data, together with the availability of genome sequences from several primate species, provides an unprecedented opportunity to investigate how natural selection has shaped human evolution. Results: We compared human branch-specific substitutions with variation data in the current human population to measure the impact of adaptive evolution on human protein coding genes. The use of single nucleotide polymorphisms (SNPs) with high derived allele frequencies (DAFs) minimized the influence of segregating slightly deleterious mutations and improved the estimation of the number of adaptive sites. Using DAF ≥ 60% we showed that the proportion of adaptive substitutions is 0.2% in the complete gene set. However, the percentage rose to 40% when we focused on genes that are specifically accelerated in the human branch with respect to the chimpanzee branch, or on genes that show signatures of adaptive selection at the codon level by the maximum likelihood based branch-site test. In general, neural genes are enriched in positive selection signatures. Genes with multiple lines of evidence of positive selection include taxilin beta, which is involved in motor nerve regeneration, and syntabulin, which is required for the formation of new presynaptic boutons. Conclusions: We combined several methods to detect adaptive evolution in human coding sequences at a genome-wide level. The use of variation data, in addition to sequence divergence information, uncovered previously undetected positive selection signatures in neural genes. Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-599) contains supplementary material, which is available to authorized users.
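The abstract above compares branch-specific substitutions with polymorphism restricted to high-DAF SNPs to estimate the proportion of adaptive substitutions. As a rough sketch of that kind of calculation, assuming the standard McDonald-Kreitman-style estimator α = 1 − (Ds·Pn)/(Dn·Ps), which the abstract does not confirm as the exact estimator used, the function name `estimate_alpha`, the tuple layout of the SNP records and all counts below are hypothetical:

```python
def estimate_alpha(dn, ds, snps, min_daf=0.6):
    """Estimate the proportion of adaptive nonsynonymous substitutions.

    dn, ds : counts of nonsynonymous and synonymous substitutions on the branch.
    snps   : iterable of (kind, daf) pairs, kind being "nonsyn" or "syn" and
             daf the derived allele frequency in [0, 1].
    min_daf: SNPs below this DAF are discarded to limit the contribution of
             segregating slightly deleterious mutations.

    Assumed estimator (not confirmed by the source):
        alpha = 1 - (ds * pn) / (dn * ps)
    """
    kept = [(kind, daf) for kind, daf in snps if daf >= min_daf]
    pn = sum(1 for kind, _ in kept if kind == "nonsyn")
    ps = sum(1 for kind, _ in kept if kind == "syn")

    if dn == 0 or ps == 0:
        raise ValueError("alpha is undefined when dn or ps is zero")

    return 1.0 - (ds * pn) / (dn * ps)


if __name__ == "__main__":
    # Hypothetical toy data, not values from the study.
    toy_snps = (
        [("nonsyn", 0.8)] * 3      # high-DAF nonsynonymous SNPs (kept)
        + [("syn", 0.7)] * 10      # high-DAF synonymous SNPs (kept)
        + [("nonsyn", 0.05)] * 20  # low-DAF SNPs (filtered out)
    )
    print(estimate_alpha(dn=100, ds=200, snps=toy_snps))
```

Raising `min_daf` trades a lower bias from slightly deleterious variants against fewer SNPs and therefore a noisier estimate, which is the trade-off the DAF ≥ 60% threshold in the abstract addresses.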