Mycobacterium tuberculosis, the causative agent of tuberculosis (TB), is a significant burden on global health. Antibiotic treatment imposes strong selective pressure on M. tuberculosis populations. Identifying the mutations that cause drug resistance in M. tuberculosis is important for guiding TB treatment and halting the spread of drug resistance. Whole-genome sequencing (WGS) of M. tuberculosis isolates can be used to identify novel mutations mediating drug resistance and to predict resistance patterns faster than traditional methods of drug susceptibility testing. We have used WGS from natural populations of drug-resistant M. tuberculosis to characterize effects of selection for advantageous mutations on patterns of diversity at genes involved in drug resistance. The methods developed here can be used to identify novel advantageous mutations, including new resistance loci, in M. tuberculosis and other clonal pathogens.
Although numerous algorithms have been developed to identify structural variations (SVs) in genomic sequences, there is a dearth of approaches that can be used to evaluate their results. This is significant because the accurate identification of structural variation remains an important outstanding problem in genomics. The emergence of new sequencing technologies that generate longer sequence reads can, in theory, provide direct evidence for all types of SVs regardless of the length of the region they span. However, current efforts to use these data in this manner require large computational resources to assemble the sequences as well as visual inspection of each region. Here we present VaPoR, a highly efficient algorithm that autonomously validates large SV sets using long-read sequencing data. We assessed the performance of VaPoR on SVs in both simulated and real genomes and report high overall accuracy across different sequencing depths. We show that VaPoR can interrogate a much larger range of SVs while still matching existing methods in terms of false positive validations and providing additional features relating to breakpoint precision and predicted genotype. We further show that VaPoR can run quickly and efficiently without requiring a large processing or assembly pipeline. VaPoR provides a long-read-based validation approach for genomic SVs that requires relatively low read depth and computing resources and thus will be useful for accurate SV assessment with targeted or low-pass sequencing coverage. The VaPoR software is available at: https://github.com/mills-lab/vapor.
Background: The main goal of this collaborative effort is to provide genome-wide data for a previously underrepresented population in Eastern Europe, and to cross-validate genome sequences and genotypes of the same individuals acquired by different technologies. We collected 97 genome-grade DNA samples from individuals representing major regions of Ukraine who consented to public data release. BGISEQ-500 sequence data and genotypes from an Illumina GWAS chip were cross-validated on multiple samples and additionally referenced to 1 sample that was resequenced by Illumina NovaSeq6000 S4 at high coverage. Results: The genome data were searched for genomic variation represented in this population, and a number of variants are reported: large structural variants, indels, copy number variations, single-nucleotide polymorphisms, and microsatellites. To our knowledge, this study provides the largest survey to date of genetic variation in Ukraine, creating a public reference resource intended to provide data for medical research in a large understudied population. Conclusions: Our results indicate that the genetic diversity of the Ukrainian population is uniquely shaped by evolutionary and demographic forces and cannot be ignored in future genetic and biomedical studies. These data will contribute a wealth of new information, bringing forth many novel, endemic and medically related alleles.
Mycobacteria have a distinct secretion system, termed type VII (T7SS), which is encoded by paralogous chromosomal loci (ESX) and associated with pathogenesis, conjugation, and metal homeostasis. Evolution of paralogous gene families is of interest because duplication is an important mechanism by which novel genes evolve, but there are potential conflicts between adaptive forces that stabilize duplications and those that enable evolution of new functions. Our objective was to delineate the adaptive forces underlying diversification of T7SS. Plasmid-borne ESX were described recently, and we found evidence that the initial duplication and divergence of ESX systems occurred on plasmids and was driven by selection for advantageous mutations. Plasmid conjugation has been linked to T7SS and type IV secretion systems (T4SS) in mycobacteria, and we discovered that T7SS and T4SS genes evolved in concert on the plasmids. We hypothesize that differentiation of plasmid ESX helps to prevent conjugation among cells harboring incompatible plasmids. Plasmid ESX appear to have been repurposed following migration to the chromosome, and there is evidence of positive selection driving further differentiation of chromosomal ESX. We hypothesize that ESX loci were initially stabilized on the chromosome by mediating their own transfer. These results emphasize the diverse adaptive paths underlying evolution of novelty, which in this case involved plasmid duplications, selection for advantageous mutations in the mobile and core genomes, migration of the loci between plasmids and chromosomes, and lateral transfer among chromosomes. We discuss further implications for the choice of model organism to study ESX functions in Mycobacterium tuberculosis.