Whole genome sequencing (WGS) of Mycobacterium tuberculosis has rapidly evolved from a research tool to a clinical application for the diagnosis and management of tuberculosis and in public health surveillance. This evolution has been facilitated by the dramatic drop in costs, advances in technology, and concerted efforts to translate sequencing data into actionable information. There is, however, a risk that, in the absence of consensus and international standards, the widespread use of WGS technology may result in data and processes that lack harmonisation, comparability and validation. In this review, we outline the current landscape of WGS pipelines and applications and set out best practices for M. tuberculosis WGS, including standards for bioinformatics pipelines, a curated repository of resistance-causing variants, phylogenetic analyses, quality control processes, and standardised reporting.
1. Introduction
Mycobacterium tuberculosis complex (Mtbc) pathogens are collectively the top infectious disease killer globally, causing 10 million new tuberculosis (TB) cases annually [1]. Increasingly, new TB cases are already resistant to rifampicin and isoniazid (termed multidrug resistance; MDR-TB), the key first-line drugs [1]. Tackling the spread and drug resistance burden of this pathogen requires concerted global effort in prevention, diagnosis, treatment and surveillance.
Human tuberculosis (TB) is caused by members of the Mycobacterium tuberculosis complex (MTBC). The MTBC comprises several human-adapted lineages known as M. tuberculosis sensu stricto, as well as two lineages (L5 and L6) traditionally referred to as Mycobacterium africanum. Strains of L5 and L6 are largely limited to West Africa for reasons unknown, and little is known of their genomic diversity, phylogeography and evolution. Here, we analysed the genomes of 350 L5 and 320 L6 strains, isolated from patients from 21 African countries, plus 5 related genomes that had not been classified into any of the known MTBC lineages. Our population genomic and phylogeographical analyses showed that the unclassified genomes belonged to a new group that we propose to name MTBC lineage 9 (L9). While the most likely ancestral distribution of L9 was predicted to be East Africa, the most likely ancestral distribution for both L5 and L6 was the Eastern part of West Africa. Moreover, we found important differences between L5 and L6 strains with respect to their phylogeographical substructure and genetic diversity. Finally, we could not confirm the previous association of drug-resistance markers with lineage and sublineages. Instead, our results indicate that the association of drug resistance with lineage is most likely driven by sample bias or geography. In conclusion, our study sheds new light onto the genomic diversity and evolutionary history of M. africanum, and highlights the need to consider the particularities of each MTBC lineage for understanding the ecology and epidemiology of TB in Africa and globally.
The human- and animal-adapted lineages of the Mycobacterium tuberculosis complex (MTBC) are thought to have expanded from a common progenitor in Africa. However, the molecular events that accompanied this emergence remain largely unknown. Here, we describe two MTBC strains isolated from patients with multidrug-resistant tuberculosis, representing an as-yet-unknown lineage, named Lineage 8 (L8), seemingly restricted to the African Great Lakes region. Using genome-based phylogenetic reconstruction, we show that L8 is a sister clade to the known MTBC lineages. Comparison with other complete mycobacterial genomes indicates that the divergence of L8 preceded the loss of the cobF genome region (involved in cobalamin/vitamin B12 synthesis) and gene interruptions in a subsequent common ancestor shared by all other known MTBC lineages. This discovery further supports an East African origin for the MTBC and provides additional molecular clues on the ancestral genome reduction associated with adaptation to a pathogenic lifestyle.
Background: Tracking recent transmission is a vital part of controlling widespread pathogens such as Mycobacterium tuberculosis. Multiple methods with specific performance characteristics exist for detecting recent transmission chains, usually by clustering strains based on genotype similarities. With such a large variety of methods available, informed selection of an appropriate approach for determining transmissions within a given setting/time period is difficult. Methods: This study combines whole genome sequence (WGS) data derived from 324 isolates collected 2005–2010 in Kinshasa, Democratic Republic of Congo (DRC), a high endemic setting, with phylodynamics to unveil the timing of transmission events posited by a variety of standard genotyping methods. Clustering data based on Spoligotyping, 24-loci MIRU-VNTR typing, WGS based SNP (Single Nucleotide Polymorphism) and core genome multi locus sequence typing (cgMLST) typing were evaluated. Findings: Our results suggest that clusters based on Spoligotyping could encompass transmission events that occurred almost 200 years prior to sampling while 24-loci-MIRU-VNTR often represented three decades of transmission. Instead, WGS based genotyping applying low SNP or cgMLST allele thresholds allows for determination of recent transmission events, e.g. in timespans of up to 10 years for a 5 SNP/allele cut-off. Interpretation: With the rapid uptake of WGS methods in surveillance and outbreak tracking, the findings obtained in this study can guide the selection of appropriate clustering methods for uncovering relevant transmission chains within a given time-period. For high resolution cluster analyses, WGS-SNP and cgMLST based analyses have similar clustering/timing characteristics even for data obtained from a high incidence setting.
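To make the idea of a SNP cut-off concrete, the following is a minimal sketch of threshold-based clustering. It is not the pipeline used in the study: the abstract does not state the clustering algorithm, so single-linkage grouping is an assumption here, and the function names (`snp_distance`, `cluster_by_threshold`), the toy sequences and the isolate labels are all hypothetical. It assumes isolates are already represented as equal-length aligned sequences of variable sites.

```python
from __future__ import annotations

from itertools import combinations


def snp_distance(a: str, b: str) -> int:
    """Count aligned positions where both bases are unambiguous and differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1 for x, y in zip(a, b)
        if x != y and x in "ACGT" and y in "ACGT"
    )


def cluster_by_threshold(seqs: dict[str, str], max_snps: int = 5) -> list[set[str]]:
    """Single-linkage clustering (an assumption, not the study's method).

    Two isolates are linked if their pairwise SNP distance is <= max_snps;
    clusters are the connected components with at least two members.
    """
    parent = {name: name for name in seqs}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) <= max_snps:
            parent[find(a)] = find(b)

    groups: dict[str, set[str]] = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    # Singletons are not treated as transmission clusters.
    return [members for members in groups.values() if len(members) > 1]


# Hypothetical toy alignment of variable sites for four isolates.
toy = {
    "iso_A": "ACGTACGTAC",
    "iso_B": "ACGTACGTAT",
    "iso_C": "ACGTACGAAT",
    "iso_D": "TTTTTTTTTT",
}
clusters = cluster_by_threshold(toy, max_snps=5)
```

With single linkage, two isolates can end up in the same cluster even when their direct distance exceeds the cut-off, as long as a chain of closer intermediates connects them; a stricter threshold therefore narrows the time window a cluster is likely to span.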