This paper presents a meta-analysis of high-resolution human leukocyte antigen (HLA) allele frequency data describing 497 population samples. Most of the datasets were compiled from studies published in eight journals from 1990 to 2007; additional datasets came from the International Histocompatibility Workshops and from the AlleleFrequencies.net database. In all, these data represent approximately 66,800 individuals from throughout the world, providing an opportunity to observe trends that may not have been evident at the time the data were originally analyzed, especially with regard to the relative importance of balancing selection among the HLA loci. Population genetic measures of allele frequency distributions were summarized across populations by locus and geographic region. A role for balancing selection in maintaining much of HLA variation was confirmed. Further, the breadth of this meta-analysis allowed the ranking of the HLA loci, with DQA1 and HLA-C showing the strongest balancing selection and DPB1 being compatible with neutrality. Comparisons of the allelic spectra reported by studies since 1990 suggest that most of the HLA alleles identified since 2000 are very-low-frequency alleles. The literature-based allele-count data, as well as maps summarizing the geographic distributions for each allele, are available online.
We have updated the catalogue of common and well-documented (CWD) HLA alleles to reflect current understanding of the prevalence of specific allele sequences. The original CWD catalogue designated 721 alleles at the HLA-A, -B, -C, -DRB1, -DRB3/4/5, -DQA1, -DQB1, and -DPB1 loci in IMGT/HLA Database release 2.15.0 as being CWD. The updated CWD catalogue designates 1122 alleles at the HLA-A, -B, -C, -DRB1, -DRB3/4/5, -DQA1, -DQB1, -DPA1 and -DPB1 loci as being CWD, and represents 14.3% of the HLA alleles in IMGT/HLA Database release 3.9.0. In particular, we identified 415 of these alleles as being “common” (having known frequencies) and 707 as being “well-documented” on the basis of ~140,000 sequence-based typing observations and available HLA haplotype data. Using these allele prevalence data, we have also assigned CWD status to specific G and P designations. We identified 147/151 G groups and 290/415 P groups as being CWD. The CWD catalogue will be updated on a regular basis, and will incorporate changes to the IMGT/HLA Database as well as empirical data from the histocompatibility and immunogenetics community. This version 2.0.0 of the CWD catalogue is available online at cwd.immunogenomics.org, and will be integrated into the Allele Frequencies Net Database, the IMGT/HLA Database and the National Marrow Donor Program’s bioinformatics web pages.
Molecular differences between HLA alleles vary up to 57 nucleotides within the peptide binding coding region of human Major Histocompatibility Complex (MHC) genes, but it is still unclear whether this variation results from a stochastic process or from selective constraints related to functional differences among HLA molecules. Although HLA alleles are generally treated as equidistant molecular units in population genetic studies, DNA sequence diversity among populations is also crucial to interpret the observed HLA polymorphism. In this study, we used a large dataset of 2,062 DNA sequences defined for the different HLA alleles to analyze nucleotide diversity of seven HLA genes in 23,500 individuals of about 200 populations spread worldwide. We first analyzed the HLA molecular structure and diversity of these populations in relation to geographic variation and we further investigated possible departures from selective neutrality through Tajima's tests and mismatch distributions. All results were compared to those obtained by classical approaches applied to HLA allele frequencies. Our study shows that the global patterns of HLA nucleotide diversity among populations are significantly correlated to geography, although in some specific cases the molecular information reveals unexpected genetic relationships. At all loci except HLA-DPB1, populations have accumulated a high proportion of very divergent alleles, suggesting an advantage of heterozygotes expressing molecularly distant HLA molecules (asymmetric overdominant selection model). However, both different intensities of selection and unequal levels of gene conversion may explain the heterogeneous mismatch distributions observed among the loci. Also, distinctive patterns of sequence divergence observed at the HLA-DPB1 locus suggest current neutrality but old selective pressures on this gene.
We conclude that HLA DNA sequences advantageously complement HLA allele frequencies as a source of data used to explore the genetic history of human populations, and that their analysis allows a more thorough investigation of human MHC molecular evolution.
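The two neutrality diagnostics named in the preceding abstract, Tajima's test and the mismatch distribution, can be illustrated with a minimal sketch. It uses the standard Tajima (1989) formula for D and is not the authors' pipeline. The function names are hypothetical, and the input is assumed to be a list of gap-free, equal-length aligned sequences, one per sampled chromosome, so allele frequencies are represented by repeating a sequence.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def pairwise_differences(seqs):
    """Number of differing sites for every pair of aligned sequences."""
    return [sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)]


def mismatch_distribution(seqs):
    """Map each pairwise-difference count to the number of pairs showing it."""
    return dict(Counter(pairwise_differences(seqs)))


def tajimas_d(seqs):
    """Tajima's D for a sample of aligned sequences (hypothetical helper)."""
    n = len(seqs)
    if n < 4:
        raise ValueError("need at least 4 sequences; the variance term is zero below that")
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to equal length")

    # S: number of segregating (polymorphic) sites in the alignment.
    s_sites = sum(len(set(column)) > 1 for column in zip(*seqs))
    if s_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of pairwise nucleotide differences.
    diffs = pairwise_differences(seqs)
    pi = sum(diffs) / len(diffs)

    # Constants from Tajima (1989); they depend only on the sample size n.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    # D contrasts pi with the Watterson estimate S / a1, scaled by its standard deviation.
    return (pi - s_sites / a1) / sqrt(e1 * s_sites + e2 * s_sites * (s_sites - 1))


if __name__ == "__main__":
    # Toy sample: two divergent sequences at intermediate frequency.
    sample = ["AAAA", "AAAA", "TTTT", "TTTT"]
    print(tajimas_d(sample))
    print(mismatch_distribution(sample))
```

In this toy sample the two divergent sequences sit at intermediate frequency, so pi exceeds S / a1 and D is positive, the direction usually associated with balancing selection. A sample dominated by rare variants gives a negative D.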
Funding information: Australian Government Research Training Program Stipend (RTPS); Australian
We report detailed peptide-binding affinities between 438 HLA Class I and Class II proteins and complete proteomes of seven pandemic human viruses, including coronaviruses, influenza viruses and HIV-1. We contrast these affinities with HLA allele frequencies across hundreds of human populations worldwide. Statistical modelling shows that peptide-binding affinities classified into four distinct categories depend on the HLA locus but that the type of virus is only a weak predictor, except in the case of HIV-1. Among the strong HLA binders (IC50 ≤ 50), we uncovered 16 alleles (the top ones being A*02:02, B*15:03 and DRB1*01:02) binding more than 1% of peptides derived from all viruses, 9 (top ones including HLA-A*68:01, B*15:25, C*03:02 and DRB1*07:01)
The genes coding for the main molecules involved in the human immune system – immunoglobulins, human leucocyte antigen (HLA) molecules and killer-cell immunoglobulin-like receptors (KIR) – exhibit a very high level of polymorphism that reveals remarkable frequency variation in human populations. 'Genetic marker' (GM) allotypes located in the constant domains of IgG antibodies have been studied for over 40 years through serological typing, leading to the identification of a variety of GM haplotypes whose frequencies vary sharply from one geographic region to another. An impressive diversity of HLA alleles, which results in amino acid substitutions located in the antigen-binding region of HLA molecules, also varies greatly among populations. The KIR differ between individuals according to both gene content and allelic variation, and also display considerable population diversity. Whereas the molecular evolution of these polymorphisms has most likely been subject to natural selection, principally driven by host–pathogen interactions, their patterns of genetic variation worldwide show significant signals of human geographic expansion, demographic history and cultural diversification. As current developments in population genetic analysis and computer simulation improve our ability to discriminate among different – either stochastic or deterministic – forces acting on the genetic evolution of human populations, the study of these systems shows great promise for investigating both the peopling history of modern humans in the time since their common origin and human adaptation to past environmental (e.g. pathogenic) changes. Therefore, in addition to mitochondrial DNA, Y-chromosome, microsatellites, single nucleotide polymorphisms and other markers, immunogenetic polymorphisms represent essential and complementary tools for anthropological studies.