We describe the Phase II HapMap, which characterizes over 3.1 million human single nucleotide polymorphisms (SNPs) genotyped in 270 individuals from four geographically diverse populations and includes 25-35% of common SNP variation in the populations surveyed. The map is estimated to capture untyped common variation with an average maximum r² of between 0.9 and 0.96 depending on population. We demonstrate that the current generation of commercial genome-wide genotyping products captures common Phase II SNPs with an average maximum r² of up to 0.8 in African and up to 0.95 in non-African populations, and that potential gains in power in association studies can be obtained through imputation. These data also reveal novel aspects of the structure of linkage disequilibrium. We show that 10-30% of pairs of individuals within a population share at least one region of extended genetic identity arising from recent ancestry and that up to 1% of all common variants are untaggable, primarily because they lie within recombination hotspots. We show that recombination rates vary systematically around genes and between genes of different function. Finally, we demonstrate increased differentiation at non-synonymous, compared to synonymous, SNPs, resulting from systematic differences in the strength or efficacy of natural selection between populations.
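To make the "average maximum r²" statistic in the abstract above concrete, here is a minimal sketch. It assumes phased haplotypes with alleles coded 0/1, and the function names (`r_squared`, `mean_max_r2`) and toy data are hypothetical illustrations, not the consortium's actual pipeline.

```python
from typing import Sequence


def r_squared(a: Sequence[int], b: Sequence[int]) -> float:
    """Squared correlation (r^2) between two biallelic SNPs.

    Each argument lists the allele (0 or 1) carried by every phased
    haplotype at one SNP; both lists must cover the same haplotypes.
    """
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("SNP vectors must be non-empty and of equal length")
    p_a = sum(a) / n                                # frequency of allele 1 at SNP a
    p_b = sum(b) / n                                # frequency of allele 1 at SNP b
    p_ab = sum(x * y for x, y in zip(a, b)) / n     # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                            # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    # A monomorphic SNP has no variance, so r^2 is undefined; report 0 here.
    return 0.0 if denom == 0 else d * d / denom


def mean_max_r2(untyped: Sequence[Sequence[int]],
                typed: Sequence[Sequence[int]]) -> float:
    """Average, over untyped SNPs, of the best r^2 to any typed SNP."""
    if not untyped or not typed:
        raise ValueError("need at least one untyped and one typed SNP")
    best = [max(r_squared(u, t) for t in typed) for u in untyped]
    return sum(best) / len(best)


# Toy example with four haplotypes (hypothetical data).
typed_snps = [[0, 0, 1, 1]]
untyped_snps = [[1, 1, 0, 0],   # perfectly (inversely) correlated with the typed SNP
                [0, 1, 0, 1]]   # uncorrelated with the typed SNP
print(mean_max_r2(untyped_snps, typed_snps))
```

In this toy case one untyped SNP is perfectly tagged and the other is not tagged at all, so the average sits halfway between 0 and 1. Real analyses work from genotype data across many samples and typically restrict the comparison to SNPs within a physical window.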
With the advent of dense maps of human genetic variation, it is now possible to detect positive natural selection across the human genome. Here we report an analysis of over 3 million polymorphisms from the International HapMap Project Phase 2 (HapMap2)1. We used 'long-range haplotype' methods, which were developed to identify alleles segregating in a population that have undergone recent selection2, and we also developed new methods that are based on cross-population comparisons to discover alleles that have swept to near-fixation within a population. The analysis reveals more than 300 strong candidate regions. Focusing on the strongest 22 regions, we develop a heuristic for scrutinizing these regions to identify candidate targets of selection. In a complementary analysis, we identify 26 non-synonymous, coding, single nucleotide polymorphisms showing regional evidence of positive selection. Examination of these candidates highlights three cases in which two genes in a common biological process have apparently undergone positive selection in the same population: LARGE and DMD, both related to infection by the Lassa virus3, in West Africa; SLC24A5 and SLC45A2, both involved in skin pigmentation4,5, in Europe; and EDAR and EDA2R, both involved in development of hair follicles6, in Asia.
© 2007 Nature Publishing Group. Correspondence and requests for materials should be addressed to P.C.S. (pardis@broad.mit.edu).
Author Contributions: P.C.S., P.V., B.F. and E.S.L. initiated the project. P.V., B.F. and P.C.S. developed key software. P.C.S., P.V., B.F., S.F.S., J.L., E.H., C.C., X.X., E.B., S.A.McC. and R.G. performed analysis. P.C.S., E.B. and E.H. performed experiments. P.C.S., E.S.L., P.V. and S.F.S. wrote the manuscript.
An increasing amount of information about genetic variation, together with new analytical methods, is making it possible to explore the recent evolutionary history of the human population. The first phase of the International Haplotype Map, including ~1 million single nucleotide polymorphisms (SNPs)7, allowed preliminary examination of natural selection in humans. Now, with the publication of the Phase 2 map (HapMap2)1 in a companion paper, over 3 million SNPs have been genotyped in 420 chromosomes from three continents (120 European (CEU), 120 African (YRI) and 180 Asian from Japan and China (JPT + CHB)).
In our analysis of HapMap2, we first implemented two widely used tests that detect recent positive selection by finding common alleles carried on unusually long haplotypes2. The two, the Long-Range Haplotype (LRH)8 and the integrated Haplotype Score (iHS)9 tests...
A haplotype map of the human genome. The International HapMap Consortium. Inherited genetic variation has a critical but as yet largely uncharacterized role in human disease. Here we report a public database of common variation in the human genome: more than one million single nucleotide polymorphisms (SNPs) for which accurate and complete genotypes have been obtained in 269 DNA samples from four populations, including ten 500-kilobase regions in which essentially all information about common DNA variation has been extracted. These data document the generality of recombination hotspots, a block-like structure of linkage disequilibrium and low haplotype diversity, leading to substantial correlations of SNPs with many of their neighbours. We show how the HapMap resource can guide the design and analysis of genetic association studies, shed light on structural variation and recombination, and identify loci that may have been subject to natural selection during human evolution.
Individual differences in DNA sequence are the genetic basis of human variability. We have characterized whole-genome patterns of common human DNA variation by genotyping 1,586,383 single-nucleotide polymorphisms (SNPs) in 71 Americans of European, African, and Asian ancestry. Our results indicate that these SNPs capture most common genetic variation as a result of linkage disequilibrium, the correlation among common SNP alleles. We observe a strong correlation between extended regions of linkage disequilibrium and functional genomic elements. Our data provide a tool for exploring many questions that remain regarding the causal role of common human DNA variation in complex human traits and for investigating the nature of genetic variation within and between human populations.
A dense map of genetic variation in the laboratory mouse genome will provide insights into the evolutionary history of the species and lead to an improved understanding of the relationship between inter-strain genotypic and phenotypic differences. Here we resequence the genomes of four wild-derived and eleven classical strains. We identify 8.27 million high-quality single nucleotide polymorphisms (SNPs) densely distributed across the genome, and determine the locations of the high (divergent subspecies ancestry) and low (common subspecies ancestry) SNP-rate intervals for every pairwise combination of classical strains. Using these data, we generate a genome-wide haplotype map containing 40,898 segments, each with an average of three distinct ancestral haplotypes. For the haplotypes in the classical strains that are unequivocally assigned ancestry, the genetic contributions of the Mus musculus subspecies--M. m. domesticus, M. m. musculus, M. m. castaneus and the hybrid M. m. molossinus--are 68%, 6%, 3% and 10%, respectively; the remaining 13% of haplotypes are of unknown ancestral origin. The considerable regional redundancy of the SNP data will facilitate imputation of the majority of these genotypes in less-densely typed classical inbred strains to provide a complete view of variation in additional strains.
We have identified a previously undetected cis-acting element in the mouse β-major globin promoter region that is necessary for maximal transcription levels of the gene in the inducible preerythroid murine erythroleukemia (MEL) cell line. This element, termed the β-globin direct-repeat element (βDRE), consists of a directly repeated 10-base-pair sequence, 5'-AGGGCAG(G)AGC-3', that lies just upstream from the TATA box of the promoter. The βDRE motif is highly conserved in all adult mammalian β-globin promoter sequences known. Mutation of either single repeat alone caused less than a twofold decrease in transcript levels. However, simultaneous mutation of both repeated regions resulted in a ninefold decrease in accumulated transcripts when the gene was transiently transfected into MEL cells. Attachment of the βDRE to a heterologous promoter had little effect on levels of accumulated transcripts initiated from the promoter in undifferentiated MEL cells but resulted in a threefold increase in transcript levels in induced (differentiated) MEL cells. Similarly, a comparison of the relative effects of mutations in the βDRE in uninduced and induced MEL cells indicated that the element was more active in induced cells. The increase in βDRE activity upon MEL cell differentiation and the more pronounced effects of mutations in both repeats of the βDRE have implications for the mechanism of action of the element in regulating β-globin transcription and for mutational studies of other repetitive or redundant transcription elements.
The developmental-stage and tissue-specific regulation of the mammalian β-globin locus has provided a challenging system for understanding the molecular mechanisms that mediate gene activation during terminal differentiation. An important step in understanding this regulation is identification of the cis-acting regulatory sequences that are required for expression of globin genes in erythroid cells.
The murine erythroleukemia (MEL) cell model system for adult erythrocyte development (13) has been useful in this characterization for the adult β-globin genes. MEL cells are arrested at the proerythroblast stage of erythroid development and can be induced to terminally differentiate in vitro in a process that closely mimics the events of normal erythropoiesis (for a review, see reference 31). MEL cell differentiation is characterized by a large increase (10- to 50-fold) in the steady-state level of β-globin mRNA. This increase is due in part to an increase in the rate of transcriptional initiation from the β-globin promoter (3, 19, 47) and in part to an increase in globin mRNA stability (44). Because transfected β-globin genes are regulated similarly to the endogenous genes when transferred into MEL cells (3, 46), it has been possible to identify cis-acting elements that play a role in β-globin transcriptional regulation by mutagenesis experiments (1, 4, 6, 47). Introduction of cloned hybrid genes into transgenic mice (24, 43) and MEL cells (1, 5, 6, 47) has implicated sequences both 5' and 3' to th...
The gene for glycoprotein gB2 of herpes simplex virus type 2 strain 333 was cloned, sequenced, and expressed in mammalian cells. The gB2 protein had an overall nucleotide and amino acid sequence homology of 86% with the cognate gB1 protein. However, of the 125 amino acid substitutions or deletions, only 12.5% were conservative replacements. These differences were clustered within an NH2-terminal region, a central region, and a COOH-terminal region, resulting in domains of near identity broken by small regions of marked divergence. Regions of greatest homology included a 90-amino-acid stretch starting at residue 484 and 39 amino acids spanning residues 835 to 873, which cover a rate-of-entry locus mapped to Ala-552 and a syn locus mapped to Arg-857, respectively, in gB1 by Bzik et al. (D.
The gene for glycoprotein gB1 of herpes simplex virus type 1 strain Patton was expressed in stable Chinese hamster ovary cell lines. Expression vectors containing the dihydrofolate reductase (dhfr) cDNA plus the complete gB1 gene or a truncated gene lacking the 194 carboxyl-terminal amino acids of gB1 were transfected into CHO DHFR-deficient cells. Radioimmunoprecipitation demonstrated that the complete gB1 protein expressed in CHO cell lines was cell associated, whereas the truncated protein was secreted from the cells due to deletion of the transmembrane and C-terminal domains of gB1. Cells expressing the truncated gB1 protein were subjected to stepwise methotrexate selection, and a cell line was isolated in which the gB1 gene copy number had been amplified 10-fold and the level of expression of gB1 had increased over 60-fold. The truncated gB1 protein was purified from medium conditioned by the amplified cell line. N-terminal amino acid sequence analysis of this purified protein identified the signal peptide cleavage site and predicted the cleavage of a 30-amino-acid signal sequence from the primary protein. The immunogenicity of the truncated gB1 protein was also tested in mice, and high levels of antibody and protection from virus challenge were observed.