Quantifying the distribution of fitness effects among newly arising mutations in the human genome is key to resolving important debates in medical and evolutionary genetics. Here, we present a method for inferring this distribution using Single Nucleotide Polymorphism (SNP) data from a population with non-stationary demographic history (such as that of modern humans). Application of our method to 47,576 coding SNPs found by direct resequencing of 11,404 protein-coding genes in 35 individuals (20 European Americans and 15 African Americans) allows us to assess the relative contribution of demographic and selective effects to patterning amino acid variation in the human genome. We find evidence of an ancient population expansion in the sample with African ancestry and a relatively recent bottleneck in the sample with European ancestry. After accounting for these demographic effects, we find strong evidence for great variability in the selective effects of new amino acid-replacing mutations. In both populations, the patterns of variation are consistent with a leptokurtic distribution of selection coefficients (e.g., gamma or log-normal) peaked near neutrality. Specifically, we predict 27–29% of amino acid-changing (nonsynonymous) mutations are neutral or nearly neutral (|s|<0.01%), 30–42% are moderately deleterious (0.01%<|s|<1%), and nearly all the remainder are highly deleterious or lethal (|s|>1%). Our results are consistent with 10–20% of amino acid differences between humans and chimpanzees having been fixed by positive selection with the remainder of differences being neutral or nearly neutral. Our analysis also predicts that many of the alleles identified via whole-genome association mapping may be selectively neutral or (formerly) positively selected, implying that deleterious genetic variation affecting disease phenotype may be missed by this widely used approach for mapping genes underlying complex traits.
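As a rough illustration of how class fractions such as "nearly neutral" or "moderately deleterious" follow from a fitted distribution of selection coefficients, the sketch below integrates a gamma distribution over |s| between the thresholds quoted above (0.01% and 1%). The shape and scale values are hypothetical placeholders chosen for illustration, not the estimates reported in the study, and the function names are assumptions of this sketch.

```python
import math


def gamma_cdf(x, shape, scale):
    """Gamma CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    z = x / scale
    term = 1.0 / shape          # n = 0 term of the series
    total = term
    n = 1
    while abs(term) > 1e-15 * abs(total):
        term *= z / (shape + n)  # next term: z^n / (a (a+1) ... (a+n))
        total += term
        n += 1
    # Multiply by z^a * exp(-z) / Gamma(a), computed in log space.
    return total * math.exp(-z + shape * math.log(z) - math.lgamma(shape))


def dfe_class_fractions(shape, scale, cuts=(1e-4, 1e-2)):
    """Fraction of mutations with |s| below, between, and above the cuts."""
    low, high = cuts
    p_low = gamma_cdf(low, shape, scale)
    p_high = gamma_cdf(high, shape, scale)
    return {
        "nearly_neutral": p_low,
        "moderately_deleterious": p_high - p_low,
        "highly_deleterious": 1.0 - p_high,
    }


if __name__ == "__main__":
    # Hypothetical parameters: a leptokurtic gamma with mean |s| = 0.03.
    for name, frac in dfe_class_fractions(shape=0.2, scale=0.15).items():
        print(f"{name}: {frac:.3f}")
```

A shape parameter well below 1 gives the strongly peaked, long-tailed form described in the abstract; changing the hypothetical parameters shifts mass between the three classes.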
Since the divergence of humans and chimpanzees about 5 million years ago, these species have undergone a remarkable evolution with drastic divergence in anatomy and cognitive abilities. At the molecular level, despite the small overall magnitude of DNA sequence divergence, we might expect such evolutionary changes to leave a noticeable signature throughout the genome. We here compare 13,731 annotated genes from humans to their chimpanzee orthologs to identify genes that show evidence of positive selection. Many of the genes that present a signature of positive selection tend to be involved in sensory perception or immune defenses. However, the group of genes that show the strongest evidence for positive selection also includes a surprising number of genes involved in tumor suppression and apoptosis, and of genes involved in spermatogenesis. We hypothesize that positive selection in some of these genes may be driven by genomic conflict due to apoptosis during spermatogenesis. Genes with maximal expression in the brain show little or no evidence for positive selection, while genes with maximal expression in the testis tend to be enriched with positively selected genes. Genes on the X chromosome also tend to show an elevated tendency for positive selection. We also present polymorphism data from 20 Caucasian Americans and 19 African Americans for the 50 annotated genes showing the strongest evidence for positive selection. The polymorphism analysis further supports the presence of positive selection in these genes by showing an excess of high-frequency derived nonsynonymous mutations.
We performed a multitiered, case-control association study of psoriasis in three independent sample sets of white North American individuals (1,446 cases and 1,432 controls) with 25,215 gene-centric single-nucleotide polymorphisms (SNPs) and found a highly significant association with an IL12B 3'-untranslated-region SNP (rs3212227), confirming the results of a small Japanese study. This SNP was significant in all three sample sets (common odds ratio [OR_common] 0.64, combined P [P_comb] = 7.85 × 10^-10). A Monte Carlo simulation to address multiple testing suggests that this association is not a type I error. The coding regions of IL12B were resequenced in 96 individuals with psoriasis, and 30 additional IL12B-region SNPs were genotyped. Haplotypes were estimated, and genotype-conditioned analyses identified a second risk allele (rs6887695) located approximately 60 kb upstream of the IL12B coding region that exhibited association with psoriasis after adjustment for rs3212227. Together, these two SNPs mark a common IL12B risk haplotype (OR_common 1.40, P_comb = 8.11 × 10^-9) and a less frequent protective haplotype (OR_common 0.58, P_comb = 5.65 × 10^-12), which were statistically significant in all three studies. Since IL12B encodes the common IL-12p40 subunit of IL-12 and IL-23, we individually genotyped 17 SNPs in the genes encoding the other chains of these cytokines (IL12A and IL23A) and their receptors (IL12RB1, IL12RB2, and IL23R). Haplotype analyses identified two IL23R missense SNPs that together mark a common psoriasis-associated haplotype in all three studies (OR_common 1.44, P_comb = 3.13 × 10^-6). Individuals homozygous for both the IL12B and the IL23R predisposing haplotypes have an increased risk of disease (OR_common 1.66, P_comb = 1.33 × 10^-8). 
These data, and the previous observation that administration of an antibody specific for the IL-12p40 subunit to patients with psoriasis is highly efficacious, suggest that these genes play a fundamental role in psoriasis pathogenesis.
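To make the idea of a single odds ratio pooled over several sample sets concrete, here is a minimal sketch of a Mantel-Haenszel pooled estimate. This assumes the "common" odds ratio is of the Mantel-Haenszel type, which the abstract does not state; the allele counts are invented for illustration and are not data from the study.

```python
def mantel_haenszel_or(strata):
    """Pooled odds ratio across 2x2 tables.

    Each stratum is (a, b, c, d):
      a = cases with the allele,    b = cases without it,
      c = controls with the allele, d = controls without it.
    """
    num = 0.0
    den = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        num += a * d / n   # weight each stratum by its sample size
        den += b * c / n
    return num / den


if __name__ == "__main__":
    # Hypothetical allele counts for three independent sample sets.
    sample_sets = [
        (150, 850, 220, 780),
        (140, 760, 200, 700),
        (160, 840, 230, 770),
    ]
    print(f"pooled OR: {mantel_haenszel_or(sample_sets):.2f}")
```

A pooled value below 1 indicates that the allele is less common in cases than in controls across the sample sets, and a value above 1 indicates the reverse.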
Comparisons of DNA polymorphism within species to divergence between species enable the discovery of molecular adaptation in evolutionarily constrained genes as well as the differentiation of weak from strong purifying selection. The extent to which weak negative and positive Darwinian selection have driven the molecular evolution of different species varies greatly, with some species, such as Drosophila melanogaster, showing strong evidence of pervasive positive selection, and others, such as the selfing weed Arabidopsis thaliana, showing an excess of deleterious variation within local populations. Here we contrast patterns of coding sequence polymorphism identified by direct sequencing of 39 humans for over 11,000 genes to divergence between humans and chimpanzees, and find strong evidence that natural selection has shaped the recent molecular evolution of our species. Our analysis discovered 304 (9.0%) out of 3,377 potentially informative loci showing evidence of rapid amino acid evolution. Furthermore, 813 (13.5%) out of 6,033 potentially informative loci show a paucity of amino acid differences between humans and chimpanzees, indicating weak negative selection and/or balancing selection operating on mutations at these loci. We find that the distribution of negatively and positively selected genes varies greatly among biological processes and molecular functions, and that some classes, such as transcription factors, show an excess of rapidly evolving genes, whereas others, such as cytoskeletal proteins, show an excess of genes with extensive amino acid polymorphism within humans and yet little amino acid divergence between humans and chimpanzees.
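One classic way to contrast polymorphism with divergence at a single locus is a McDonald-Kreitman-style 2x2 table, sketched below with a neutrality index and a two-sided Fisher's exact test. The abstract does not say that this exact test was used, so this is only an illustration of the general idea; the counts in the usage example are made up.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables at least as unlikely as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


def mk_test(pn, dn, ps, ds):
    """pn/ps: nonsynonymous/synonymous polymorphisms; dn/ds: fixed differences."""
    # Neutrality index NI = (Pn / Ps) / (Dn / Ds); undefined if Ps or Dn is 0.
    ni = (pn * ds) / (ps * dn) if ps * dn > 0 else float("inf")
    return {"neutrality_index": ni, "p_value": fisher_exact_two_sided(pn, dn, ps, ds)}


if __name__ == "__main__":
    # Hypothetical counts for one locus.
    result = mk_test(pn=2, dn=7, ps=42, ds=17)
    print(result)
```

Under this convention, a neutrality index below 1 points to an excess of amino acid divergence (as expected under rapid amino acid evolution), while a value above 1 points to an excess of amino acid polymorphism relative to divergence (as expected under weak negative selection).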
Even though human and chimpanzee gene sequences are nearly 99% identical, sequence comparisons can nevertheless be highly informative in identifying biologically important changes that have occurred since our ancestral lineages diverged. We analyzed alignments of 7645 chimpanzee gene sequences to their human and mouse orthologs. These three-species sequence alignments allowed us to identify genes undergoing natural selection along the human and chimp lineage by fitting models that include parameters specifying rates of synonymous and nonsynonymous nucleotide substitution. This evolutionary approach revealed an informative set of genes with significantly different patterns of substitution on the human lineage compared with the chimpanzee and mouse lineages. Partitions of genes into inferred biological classes identified accelerated evolution in several functional classes, including olfaction and nuclear transport. In addition to suggesting adaptive physiological differences between chimps and humans, human-accelerated genes are significantly more likely to underlie major known Mendelian disorders.
Quantifying the number of deleterious mutations per diploid human genome is of crucial concern to both evolutionary and medical geneticists [1-3]. Here we combine genome-wide polymorphism data from PCR-based exon resequencing, comparative genomic data across mammalian species, and protein structure predictions to estimate the number of functionally consequential single-nucleotide polymorphisms (SNPs) carried by each of 15 African American (AA) and 20 European American (EA) individuals. We find that AAs show significantly higher levels of nucleotide heterozygosity than do EAs for all categories of functional SNPs considered, including synonymous, non-synonymous, predicted 'benign', predicted 'possibly damaging' and predicted 'probably damaging' SNPs. This result is wholly consistent with previous work showing higher overall levels of nucleotide variation in African populations than in Europeans [4]. EA individuals, in contrast, have significantly more genotypes homozygous for the derived allele at synonymous and non-synonymous SNPs and for the damaging allele at 'probably damaging' SNPs than AAs do. For SNPs segregating only in one population or the other, the proportion of non-synonymous SNPs is significantly higher in the EA sample (55.4%) than in the AA sample (47.0%; P < 2.3 × 10^-37). We observe a similar proportional excess of SNPs that are inferred to be 'probably damaging' (15.9% in EA; 12.1% in AA; P < 3.3 × 10^-11). Using extensive simulations, we show that this excess proportion of segregating damaging alleles in Europeans is probably a consequence of a bottleneck that Europeans experienced at about the time of the migration out of Africa.
Current estimates of the number of deleterious mutations per diploid human genome vary by several orders of magnitude. Using a correlation in inbreeding rates within consanguineous marriages and mortality, Morton et al. [5] estimated that each of us carries three to five lethal equivalents (that is, an allele or combination of alleles that if made homozygous would be lethal), whereas Kondrashov [6] has predicted that the number may be as high as 100 lethal equivalents. Comparative genomic methods indicate that about 38% of amino acid-changing polymorphisms are deleterious, with 1.6 new deleterious mutations arising per individual per generation [7], whereas studies based on segregating polymorphisms estimate that each person carries between 500 and 1,200 deleterious mutations [3,8]. It is difficult to reconcile these estimates because each study used different methods and data. Furthermore, studies that used DNA sequences included data from only several hundred genes. Hence there is a crucial need for an unbiased genome-wide estimate of the number of damaging mutations carried by individuals in different populations.
We quantify the number of damaging mutations per diploid human genome by combining the Applera genome-wide survey of SNPs found by the resequencing of 20 EAs and 15 AAs [9] with comparative genomic data including the PanTro2 build of the chimpanzee genome and pre...
Coccidioides immitis, cause of a recent epidemic of "Valley fever" in California, is typical of many eukaryotic microbes in that mating and meiosis have yet to be reported, but it is not clear whether sex is truly absent or just cryptic. To find out, we have undertaken a population genetic study using PCR amplification, screening for single-strand conformation polymorphisms, and direct DNA sequencing to find molecular markers with nucleotide-level resolution. Both population genetic and phylogenetic analyses indicate that C. immitis is almost completely recombining. To our knowledge, this study is the first to find molecular evidence for recombination in a fungus for which no sexual stage has yet been described. These results motivate a directed search for mating and meiosis and illustrate the utility of single-strand conformation polymorphism and sequencing with arbitrary primer pairs in molecular population genetics.
Unlike most plants and animals, the vast majority of eukaryotic microorganisms can reproduce asexually, and this together with their small size can make it very difficult to determine the relative importance of sexual reproduction in nature by direct observation alone. Rather, molecular markers must be used to test for the clonal population structure expected if sex is absent and the recombinant genotypes expected if sex is present (1-3). It is important in such studies that marker identity reflect common descent and that identities due to convergences, parallelisms, and reversals be minimized; in this respect, the most informative markers are DNA sequences (4,5). However, most studies of human pathogens have used allozymes or random amplified polymorphic DNAs (RAPDs) (1-3, 6). 
Neither approach is completely satisfactory, as allozyme patterns can be misleading because of natural selection (7), and RAPD patterns can be difficult both to repeat and to interpret in terms of Mendelian loci (8-10). [...] (coccidioidomycosis) in California, with case reports 10 times more frequent than normal (13,14). As with some 20% of fungi (15), no sexual stage has ever been reported in C. immitis (16). Unfortunately, the inability to cross strains has hindered basic and applied research on this species.
MATERIALS AND METHODS
For our study we have analyzed 30 clinical isolates from 25 patients at a single hospital in Tucson, Arizona. Three patients contributed multiple samples, collected up to 3 weeks, 16 months, and 8 years apart, respectively (Table 1). All isolates were collected in 1979-1990, before the epidemic.
Our strategy for finding molecular markers begins with low-stringency PCR amplification from genomic DNA using arbitrary primers (~20-mers) in various pairwise combinations (11). Genomic DNA was isolated following heat treatment to k...