We describe the Phase II HapMap, which characterizes over 3.1 million human single nucleotide polymorphisms (SNPs) genotyped in 270 individuals from four geographically diverse populations and includes 25-35% of common SNP variation in the populations surveyed. The map is estimated to capture untyped common variation with an average maximum r² of between 0.9 and 0.96 depending on population. We demonstrate that the current generation of commercial genome-wide genotyping products captures common Phase II SNPs with an average maximum r² of up to 0.8 in African and up to 0.95 in non-African populations, and that potential gains in power in association studies can be obtained through imputation. These data also reveal novel aspects of the structure of linkage disequilibrium. We show that 10-30% of pairs of individuals within a population share at least one region of extended genetic identity arising from recent ancestry and that up to 1% of all common variants are untaggable, primarily because they lie within recombination hotspots. We show that recombination rates vary systematically around genes and between genes of different function. Finally, we demonstrate increased differentiation at non-synonymous, compared to synonymous, SNPs, resulting from systematic differences in the strength or efficacy of natural selection between populations.
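As a hedged illustration of the squared-correlation statistic r² quoted above: for two biallelic SNPs, r² = D² / (p_A (1 − p_A) p_B (1 − p_B)), where D = p_AB − p_A p_B is the linkage disequilibrium coefficient. The sketch below computes it from phased haplotypes. The function name `ld_r2` and the toy haplotypes are assumptions made for illustration; they are not HapMap data or the consortium's software.

```python
def ld_r2(haplotypes):
    """Return r^2 between two biallelic SNPs.

    `haplotypes` is a list of (allele_a, allele_b) pairs, one per phased
    chromosome, with alleles coded 0/1. Hypothetical helper for illustration.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haplotypes) / n    # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n    # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                       # LD coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


# Toy data (invented): ten chromosomes, mostly concordant alleles.
toy = [(1, 1)] * 4 + [(0, 0)] * 4 + [(1, 0)] + [(0, 1)]
toy_r2 = ld_r2(toy)
```

An "average maximum r²" can then be read as follows: for each untyped SNP, take the largest r² against any typed SNP, and average those maxima over the untyped SNPs.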
With the advent of dense maps of human genetic variation, it is now possible to detect positive natural selection across the human genome. Here we report an analysis of over 3 million polymorphisms from the International HapMap Project Phase 2 (HapMap2) [1]. We used 'long-range haplotype' methods, which were developed to identify alleles segregating in a population that have undergone recent selection [2], and we also developed new methods that are based on cross-population comparisons to discover alleles that have swept to near-fixation within a population. The analysis reveals more than 300 strong candidate regions. Focusing on the strongest 22 regions, we develop a heuristic for scrutinizing these regions to identify candidate targets of selection. In a complementary analysis, we identify 26 non-synonymous, coding, single nucleotide polymorphisms showing regional evidence of positive selection. Examination of these candidates highlights three cases in which two genes in a common biological process have apparently undergone positive selection in the same population: LARGE and DMD, both related to infection by the Lassa virus [3], in West Africa; SLC24A5 and SLC45A2, both involved in skin pigmentation [4,5], in Europe; and EDAR and EDA2R, both involved in development of hair follicles [6], in Asia.
©2007 Nature Publishing Group. Correspondence and requests for materials should be addressed to P.C.S. (pardis@broad.mit.edu). * These authors contributed equally to this work. † Lists of participants and affiliations appear at the end of the paper.
Author Contributions: P.C.S., P.V., B.F. and E.S.L. initiated the project. P.V., B.F. and P.C.S. developed key software. P.C.S., P.V., B.F., S.F.S., J.L., E.H., C.C., X.X., E.B., S.A.McC. and R.G. performed analysis. P.C.S., E.B. and E.H. performed experiments. P.C.S., E.S.L., P.V. and S.F.S. wrote the manuscript.
Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature. Supplementary Information is linked to the online version of the paper at www.nature.com/nature. Reprints and permissions information is available at www.nature.com/reprints.
An increasing amount of information about genetic variation, together with new analytical methods, is making it possible to explore the recent evolutionary history of the human population. The first phase of the International Haplotype Map, including ~1 million single nucleotide polymorphisms (SNPs) [7], allowed preliminary examination of natural selection in humans. Now, with the publication of the Phase 2 map (HapMap2) [1] in a companion paper, over 3 million SNPs have been genotyped in 420 chromosomes from three continents (120 European (CEU), 120 African (YRI) and 180 Asian from Japan and China (JPT + CHB)).
In our analysis of HapMap2, we first implemented two widely used tests that detect recent positive selection by finding common alleles carried on unusually long haplotypes [2]. The two, the Long-Range Haplotype (LRH) [8] and the integrated Haplotype Score (iHS) [9] tests...
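Both long-haplotype tests named above build on extended haplotype homozygosity (EHH): the probability that two randomly chosen chromosomes carrying a given core allele are identical across the interval from the core to a more distant marker. The sketch below computes only that basic quantity. It is not the authors' software, the function name `ehh` and the toy haplotypes are illustrative assumptions, and the further steps that the LRH and iHS tests add (comparison between alleles, integration over distance, normalization) are deliberately left out.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core_index, end_index):
    """Extended haplotype homozygosity over an inclusive marker interval.

    `haplotypes` holds only the chromosomes that carry the core allele,
    each given as a string (or sequence) of alleles. Hypothetical helper.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two chromosomes")
    lo, hi = sorted((core_index, end_index))
    # Group chromosomes that are identical across the whole interval.
    counts = Counter(tuple(h[lo:hi + 1]) for h in haplotypes)
    # Fraction of chromosome pairs that fall in the same group.
    return sum(comb(c, 2) for c in counts.values()) / comb(n, 2)


# Toy data (invented): four chromosomes sharing the core allele at index 0.
toy_haps = ["0101", "0101", "0101", "0111"]
ehh_near = ehh(toy_haps, 0, 1)   # all four identical up to marker 1
ehh_far = ehh(toy_haps, 0, 3)    # one chromosome differs by marker 2
```

EHH starts at 1 next to the core and decays with distance as recombination and mutation break up the shared haplotype. An allele whose EHH decays unusually slowly for its frequency is the kind of signal these tests look for.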
A haplotype map of the human genome
The International HapMap Consortium*
Inherited genetic variation has a critical but as yet largely uncharacterized role in human disease. Here we report a public database of common variation in the human genome: more than one million single nucleotide polymorphisms (SNPs) for which accurate and complete genotypes have been obtained in 269 DNA samples from four populations, including ten 500-kilobase regions in which essentially all information about common DNA variation has been extracted. These data document the generality of recombination hotspots, a block-like structure of linkage disequilibrium and low haplotype diversity, leading to substantial correlations of SNPs with many of their neighbours. We show how the HapMap resource can guide the design and analysis of genetic association studies, shed light on structural variation and recombination, and identify loci that may have been subject to natural selection during human evolution.
By impairing both function and survival, the severe reduction in oxygen availability associated with high-altitude environments is likely to act as an agent of natural selection. We used genomic and candidate gene approaches to search for evidence of such genetic selection. First, a genome-wide allelic differentiation scan (GWADS) comparing indigenous highlanders of the Tibetan Plateau (3,200-3,500 m) with closely related lowland Han revealed a genome-wide significant divergence across eight SNPs located near EPAS1. This gene encodes the transcription factor HIF2α, which stimulates production of red blood cells and thus increases the concentration of hemoglobin in blood. Second, in a separate cohort of Tibetans residing at 4,200 m, we identified 31 EPAS1 SNPs in high linkage disequilibrium that correlated significantly with hemoglobin concentration. The sex-adjusted hemoglobin concentration was, on average, 0.8 g/dL lower in the major allele homozygotes compared with the heterozygotes. These findings were replicated in a third cohort of Tibetans residing at 4,300 m. The alleles associating with lower hemoglobin concentrations were correlated with the signal from the GWADS study and were observed at greatly elevated frequencies in the Tibetan cohorts compared with the Han. High hemoglobin concentrations are a cardinal feature of chronic mountain sickness offering one plausible mechanism for selection. Alternatively, as EPAS1 is pleiotropic in its effects, selection may have operated on some other aspect of the phenotype. Whichever of these explanations is correct, the evidence for genetic selection at the EPAS1 locus from the GWADS study is supported by the replicated studies associating function with the allelic variants.
Keywords: chronic mountain sickness | high altitude | human genome variation | hypoxia | hypoxia-inducible factor
Computationally-efficient semilocal approximations of density functional theory at the level of the local spin density approximation (LSDA) or generalized gradient approximation (GGA) poorly describe weak interactions. We show improved descriptions for weak bonds (without loss of accuracy for strong ones) from a newly-developed semilocal meta-GGA (MGGA), by applying it to molecules, surfaces, and solids. We argue that this improvement comes from using the right MGGA dimensionless ingredient to recognize all types of orbital overlap.
PACS numbers: 34.20.Gj, 31.15.E-, 87.15.A-
Due to its computational efficiency and reasonable accuracy, the Kohn-Sham density functional theory [1-3] with semilocal approximations to the exchange-correlation energy, e.g., the local spin density approximation (LSDA) [4,5] and the standard Perdew-Burke-Ernzerhof (PBE) generalized gradient approximation (GGA) [6], is one of the most widely-used electronic structure methods in materials science, surface science, condensed matter physics, and chemistry. Semilocal approximations display a well-understood error cancellation between exchange and correlation in bonding regions. Thus some intermediate-range correlation effects, important for strong and weak bonds, are carried by the exchange part of the approximation. However, it is well-known that these approximations cannot yield correct long-range asymptotic dispersion forces [7]. This raises doubts about the suitability of semilocal approximations for the description of weak interactions (including hydrogen bonds and van der Waals interactions), even near equilibrium where most interesting properties occur. These doubts are supported by the performance of LSDA and GGAs, which are not very useful for many important systems and properties (such as DNA, physisorption on surfaces, most biochemistry, etc.).
However, these doubts are challenged by recent developments in semilocal meta-GGAs (MGGA) [8-14] (which are useful by themselves and as ingredients of hybrid functionals [14]). Compared to GGAs, which use the density n(r) and its gradient ∇n as inputs, MGGAs additionally include the positive kinetic energy density $\tau = \sum_k |\nabla \psi_k|^2 / 2$ of the occupied orbitals $\psi_k$. For simplicity, we suppress the spin here. By including training sets of noncovalent interactions, the molecule-oriented and heavily-parameterized M06L MGGA was trained to capture medium-range exchange and correlation energies that dominate equilibrium structures of noncovalent complexes [9]. Madsen et al. showed that the inclusion of the kinetic energy densities enables MGGAs to discriminate between dispersive and covalent interactions, which makes the M06L MGGA [9] suitable for layered materials bonded by van der Waals interactions [15,16]. Besides improvement for noncovalent bonds, simultaneous improvement for metallic and covalent bonds is also an outstanding problem for semilocal functionals [17,18]. Ref. 18 has shown that the revised Tao-Perdew-Staroverov-Scuseria (revTPSS) [10] MGGA, due to the inclusion of ...
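To make the kinetic energy density defined above concrete, here is a small sketch that evaluates τ = Σ_k |∇ψ_k|²/2 for the simplest case of a single occupied orbital. The choice of the hydrogen 1s orbital, the use of atomic units, and all function names are assumptions made for illustration and do not come from the paper. Integrating τ over all space should recover the orbital's kinetic energy, which for hydrogen 1s is 0.5 hartree.

```python
import math


def grad_psi_1s(r):
    """Radial derivative of the hydrogen 1s orbital psi = exp(-r)/sqrt(pi).

    For a spherically symmetric orbital, |grad psi| equals |d psi / d r|.
    """
    return -math.exp(-r) / math.sqrt(math.pi)


def tau_1s(r):
    """Positive kinetic energy density for one occupied orbital (spin suppressed)."""
    return 0.5 * grad_psi_1s(r) ** 2


def integrate_over_space(f, r_max=30.0, steps=30000):
    """Trapezoidal integral of a radial function f(r) over all space."""
    dr = r_max / steps
    total = 0.0
    for i in range(steps + 1):
        r = i * dr
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * 4.0 * math.pi * r * r * f(r)   # spherical shell volume
    return total * dr


kinetic_energy = integrate_over_space(tau_1s)   # expected to be close to 0.5
```

With several occupied orbitals, `tau_1s` would be replaced by a sum of such terms, one per orbital, as in the definition.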
The Tao-Perdew-Staroverov-Scuseria (TPSS) meta-generalized-gradient-approximation (MGGA) and its revised version, the revTPSS, are implemented self-consistently within the framework of the projector-augmented-wave (PAW) method, using a plane wave basis set. Both TPSS and revTPSS yield accurate atomization energies for the molecules in the AE6 set, better than those of the standard Perdew-Burke-Ernzerhof (PBE) generalized-gradient-approximation. For lattice constants and bulk moduli of 20 diverse solids, revTPSS performs much better than PBE, and on average as well as PBEsol and Armiento-Mattsson (AM05), GGAs designed for solids. The latter two overestimate the atomization energies for molecules to an unacceptable degree. However, the revTPSS presents only a slight improvement over PBEsol for the prediction of cohesive energies for solids, and some deterioration with respect to PBE. We also study the magnetic properties of Fe, for which both TPSS and revTPSS predict the right ground-state solid phase, the ferromagnetic body-centered-cubic (bcc) structure, with an accurate magnetic moment.
In a standard Kohn-Sham density functional calculation, the total energy of a crystal at zero temperature is evaluated for a perfect static lattice of nuclei, and minimized with respect to the lattice constant. Sometimes a zero-point vibrational energy, whose anharmonicity expands the minimizing or equilibrium lattice constant, is included in the calculation or (as here) used to correct the experimental reference value for the lattice constant to that for a static lattice. A simple model for this correction, based on the Debye and Dugdale-MacDonald approximations, requires as input only readily-available parameters of the equation of state, plus the experimental Debye temperature. However, due in particular to the rough Dugdale-MacDonald estimation of Grüneisen parameters for diatomic solids, this simple model is found to overestimate the correction by about a factor of two for some solids in the diamond and zinc-blende structures. Using the quasi-harmonic phonon frequencies calculated from density functional perturbation theory gives a more accurate zero-point anharmonic expansion (ZPAE) correction. The error statistics for the lattice constants of various semilocal density functionals for the exchange-correlation energy are however little changed by improving the ZPAE correction. The Perdew-Burke-Ernzerhof generalized gradient approximation (GGA) for solids (PBEsol) and the revised Tao-Perdew-Staroverov-Scuseria (revTPSS) meta-GGA, the latter implemented here self-consistently in BAND, applied to a test set of 58 solids, remain the most accurate of the functionals tested, with mean absolute relative errors below 0.7% for the lattice constants. The most positive and most negative revTPSS relative errors tend to occur for solids where full nonlocality (missing from revTPSS) may be most important.
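A minimal sketch of the simple model described above, under stated assumptions. The abstract does not give the formula. The closed form used here follows from combining the Debye zero-point energy per atom, (9/8) k_B Θ_D, with the Dugdale-MacDonald Grüneisen parameter γ = (B1 − 1)/2, which gives Δa0/a0 = (3/16)(B1 − 1) k_B Θ_D / (B0 v0). Here B0 is the bulk modulus, B1 its pressure derivative, Θ_D the Debye temperature, and v0 the volume per atom. The function names and the input values are placeholders chosen for illustration; they are not data from the paper.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def zpae_relative_expansion(b0_gpa, b1, debye_temp_k, volume_per_atom_a3):
    """Relative lattice-constant expansion from zero-point anharmonicity.

    Assumed simple model: (3/16) * (B1 - 1) * k_B * Theta_D / (B0 * v0).
    Inputs: B0 in GPa, B1 dimensionless, Theta_D in K, v0 in cubic angstroms.
    """
    b0_pa = b0_gpa * 1.0e9                    # GPa -> Pa
    v0_m3 = volume_per_atom_a3 * 1.0e-30      # cubic angstrom -> cubic metre
    return (3.0 / 16.0) * (b1 - 1.0) * K_B * debye_temp_k / (b0_pa * v0_m3)


def static_lattice_reference(a_exp, relative_expansion):
    """First-order removal of the ZPAE from an experimental lattice constant."""
    return a_exp * (1.0 - relative_expansion)


# Placeholder inputs (invented, not from the paper).
rel = zpae_relative_expansion(b0_gpa=100.0, b1=4.0,
                              debye_temp_k=400.0, volume_per_atom_a3=16.0)
a_static = static_lattice_reference(a_exp=4.000, relative_expansion=rel)
```

With these placeholder inputs the relative expansion is about 0.19%. The abstract notes that this simple model can overestimate the correction by about a factor of two for some diamond and zinc-blende solids, so such values are rough estimates only.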
Among the computationally efficient semilocal density functionals for the exchange-correlation energy, meta-generalized-gradient approximations (meta-GGAs) are potentially the most accurate. Here, we assess the performance of three new meta-GGAs (revised Tao-Perdew-Staroverov-Scuseria or revTPSS, regularized revTPSS or regTPSS, and meta-GGA made simple or MGGA_MS), within and beyond their "comfort zones," on Grimme's big test set of main-group molecular energetics (thermochemistry, kinetics, and noncovalent interactions). We compare them against the standard Perdew-Burke-Ernzerhof (PBE) GGA, TPSS, and Minnesota M06L meta-GGAs, and Becke-3-Lee-Yang-Parr (B3LYP) hybrid of GGA with exact exchange. The overall performance of these three new meta-GGA functionals is similar. However, dramatic differences occur for different test sets. For example, M06L and MGGA_MS perform best for the test sets that contain noncovalent interactions. For the 14 Diels-Alder reaction energies in the "difficult" DARC subset, the mean absolute error ranges from 3 kcal mol⁻¹ (MGGA_MS) to 15 kcal mol⁻¹ (B3LYP), while for some other reaction subsets the order of accuracy is reversed; more generally, the tested new semilocal functionals outperform the standard B3LYP for ring reactions. Some overall improvement is found from long-range dispersion corrections for revTPSS and regTPSS but not for MGGA_MS. Formal and universality criteria for the functionals are also discussed.