Defining worldwide human genetic variation is a critical step toward revealing how genome plasticity contributes to disease. Yet there is currently no metric to assess the representativeness and completeness of widely used data on genetic variation. We show here that Human Leukocyte Antigen (HLA) genes can serve as such a metric, as they are both the most polymorphic and the most studied genetic system. As a test case, we investigated the 1000 Genomes Project panel. Using high-accuracy in silico HLA typing, we find that over 20% of the common HLA variants and over 70% of the rare HLA variants are missing from this reference panel for worldwide genetic variation, owing to undersampling and incomplete geographical coverage, particularly in Oceania and West Asia. Because common and rare variants both contribute to disease, this study illustrates how HLA diversity can detect and help fix incomplete sampling, and hence accelerate efforts to draw a comprehensive overview of the genetic variation that is relevant to health and disease.
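The kind of completeness figure reported above (the share of common and of rare HLA variants missing from a panel) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `missing_fraction`, the 1% frequency cutoff separating common from rare alleles, and the toy allele names and frequencies are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple


def missing_fraction(
    reference_freqs: Dict[str, float],
    panel_alleles: Set[str],
    common_threshold: float = 0.01,  # assumed cutoff, not taken from the study
) -> Tuple[float, float]:
    """Return (fraction of common alleles missing, fraction of rare alleles missing).

    reference_freqs maps each known allele to its frequency in a reference
    catalogue; panel_alleles is the set of alleles observed in the panel.
    """
    common = {a for a, f in reference_freqs.items() if f >= common_threshold}
    rare = set(reference_freqs) - common

    def frac_missing(group: Set[str]) -> float:
        # An empty group has nothing to miss, so report 0.0 rather than divide by zero.
        if not group:
            return 0.0
        return len(group - panel_alleles) / len(group)

    return frac_missing(common), frac_missing(rare)


# Toy data with invented allele names and frequencies (illustration only).
reference = {
    "X*01": 0.20,
    "X*02": 0.30,
    "X*03": 0.05,
    "X*04": 0.001,
    "X*05": 0.0005,
}
panel = {"X*01", "X*02", "X*04"}

common_missing, rare_missing = missing_fraction(reference, panel)
```

In this toy case one of three common alleles and one of two rare alleles are absent from the panel. With real data, the reference catalogue and the frequency cutoff would drive the result, so both choices would need to be stated alongside any reported percentage.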
Genome scans are powerful approaches for investigating the action of natural selection on the genetic variation of natural populations and for better understanding local adaptation. They are particularly useful in fields such as conservation biology and evolutionary biology. Thanks to Next Generation Sequencing, genomic resources are growing exponentially, improving genome scan analyses in non-model species. Thousands of SNPs called using Reduced Representation Sequencing are increasingly used in genome scans. In addition, genome sequences are becoming increasingly available, allowing better processing of short-read data, offering physical localization of variants, and improving haplotype reconstruction and data imputation. Ultimately, genome sequences are also becoming the raw material for selection inferences. Here, we discuss how the increasing availability of such genomic resources, notably genome sequences, influences the detection of signals of selection. In particular, increasing data density and having physical linkage information expand genome scans by (i) improving the overall quality of the data, (ii) helping to reconstruct the demographic history of the population studied, which decreases false-positive rates, and (iii) improving the statistical power of methods to detect the signal of selection. Of particular importance, the availability of a high-quality reference genome can improve the detection of the signal of selection by (i) allowing potential candidate loci to be matched to linked coding regions under selection, (ii) rapidly moving the investigation to the gene and its function and (iii) ensuring that the highly variable regions of the genome that include functional genes are also investigated. For all those reasons, using reference genomes in genome scan analyses is highly recommended.
Lateral gene transfers (LGT), species-to-species transmission of genes by means other than direct inheritance from a common ancestor, have played a significant role in shaping prokaryotic genomes and are involved in the gain or transfer of important biological processes. Whether LGT have significantly contributed to the composition of an animal genome is currently unclear. In nematodes, multiple LGT are suspected to have favored the emergence of plant-parasitism. With the availability of whole genome sequences, it is now possible to assess whether LGT have significantly contributed to the composition of an animal genome and to establish a comprehensive list of these events. We generated clusters of homologous genes and automated phylogenetic inference to detect LGT in the genomes of root-knot nematodes, and found that up to 3.34% of the genes originate from LGT of non-metazoan origin. After their acquisition, the majority of genes underwent a series of duplications. Compared to the rest of the genes in these species, several predicted functional categories showed a skewed distribution in the set of genes acquired via LGT. Interestingly, functions related to metabolism, degradation or modification of carbohydrates or proteins were substantially more frequent. This suggests that genes involved in these processes, related to a parasitic lifestyle, have been more frequently fixed in these parasites after their acquisition. Genes from soil bacteria, including plant-pathogens, were the most frequent closest relatives, suggesting that donors were preferentially bacteria from the rhizosphere. Several of these bacterial genes are plasmid-borne, pointing to a possible role of these mobile genetic elements in the transfer mechanism. Our analysis provides the first comprehensive description of the ensemble of genes of non-metazoan origin in an animal genome.
Besides being involved in important processes regarding plant-parasitism, genes acquired via LGT now constitute a substantial proportion of protein-coding genes in these nematode genomes.
Background: Horizontal gene transfer (HGT) is considered to be a major force driving the evolutionary history of prokaryotes. HGT is widespread in prokaryotes, contributing to the genomic repertoire of prokaryotic organisms, and is particularly apparent in Rickettsiales genomes. Gene gains from both distantly and closely related organisms play crucial roles in the evolution of bacterial genomes. In this work, we focus on genes transferred from distantly related species into Rickettsiales species. Results: We developed an automated approach for the detection of HGT from other organisms (excluding alphaproteobacteria) into Rickettsiales genomes. Our systematic approach consisted of several specialized features, including the application of a parsimony method for inferring phyletic patterns followed by a BLAST filter, automated phylogenetic reconstruction and the application of patterns for HGT detection. We identified 42 instances of HGT in 31 complete Rickettsiales genomes, of which 38 were previously unidentified instances of HGT from Anaplasma, Wolbachia, Candidatus Pelagibacter ubique and Rickettsia genomes. Additionally, putative cases with no phylogenetic support were assigned gene ontology terms. Overall, these transfers could be characterized as “rhizome-like”. Conclusions: Our analysis provides a comprehensive, systematic approach for the automated detection of HGTs from several complete proteome sequences that can be applied to detect instances of HGT within other genomes of interest.