Abstract: Pathogen genomic data are increasingly used to characterize global and local transmission patterns of important human pathogens and to inform public health interventions. Yet there is no current consensus on how to measure genomic variation. We investigated the effects of variant identification approaches on transmission inferences for M. tuberculosis by comparing variants identified by five different groups in the same sequence data from a clonal outbreak. We then measured the performance of com…
“…1 ). In addition, the inclusion of two synthetic FASTQ files generated from an edited reference sequence ( 22 ) and differing by three single SNPs, one double nucleotide change, and one 3-bp insertion (sample 12 and 13 in Table 3 ) were reported by participating laboratories as differing by a number of SNPs ranging from one to seven, which is in line with previous findings ( 21 ). These minor differences between pipelines, although having only minor effects when detecting potentially epidemiologically linked isolates, may have implications when inferring transmission chains as the natural accumulation of mutations in M. tuberculosis is extremely slow ( 9 ).…”
Section: Discussion
supporting
confidence: 87%
“…Nonetheless, some laboratories did report structurally high SNP distances for closely related isolates, even if the isolates were correctly reported as potentially related. This is likely due to incomplete filtering of noise in the SNP-calling algorithm, often caused by not excluding all poorly mapped reads, used by different laboratories, which makes the comparison of precise SNP distances provided by the EQA participants potentially misleading ( 21 ).…”
The wider availability of whole-genome sequencing (WGS) coupled to new developments in bioinformatic tools and databases to interpret Mycobacterium tuberculosis complex WGS data has accelerated the adoption of this method for the routine prediction of antimycobacterial drug resistance and genotyping, thus necessitating the establishment of a comprehensive external quality control system. Here, we report 4 years of development and results from such a panel.
“…First of all, while generally all pe and ppe genes are excluded from bioinformatic datasets, this approach is overly stringent in most cases. Even short‐read sequencing techniques can reliably map almost all pe and ppe genes thanks to paired‐end technologies and increased read lengths, if only pe_pgrs and ppe‐mptr genes/transcripts are excluded (Miran and Farhat – personal communication, Holt et al , ; Ates et al , ; Walter et al , ). Furthermore, knowing the subgroup of a PE/PPE can be an excellent starting point to hypothesize the most‐likely route of secretion and may even suggest functions, or redundancy.…”
Section: Secretion and Functions of Specific PE and PPE Protein Subgr…
mentioning
The PE and PPE proteins of Mycobacterium tuberculosis have been studied with great interest since their discovery. Named after the conserved proline (P) and glutamic acid (E) residues in their N-terminal domains, these proteins are postulated to perform wide-ranging roles in virulence and immune modulation. However, technical challenges in studying these proteins and their encoding genes have hampered the elucidation of molecular mechanisms and leave many open questions regarding the biological functions mediated by these proteins. Here, I review the shared and unique characteristics of PE and PPE proteins from a molecular perspective, linking this information to their functions in mycobacterial virulence. I discuss how the different subgroups (PE_PGRS, PPE-PPW, PPE-SVP and PPE-MPTR) are defined and why this classification is of paramount importance to understand the PE and PPE proteins as individuals and/or groups. The goal of this MicroReview is to summarize and structure the existing information on this gene family into a simplified framework of thinking about PE and PPE proteins and genes. Thereby, I hope to provide helpful starting points in studying these genes and proteins for researchers with different backgrounds. This has particular implications for the design and monitoring of novel vaccine candidates and in understanding the evolution of the M. tuberculosis complex.
“…Exporting a simulated outbreak 6. This simulation protocol of TransPhylo is especially useful to simulate outbreaks and benchmark how accurate methods of inference (either Basic Protocol 3 of TransPhylo or some other method) are likely to be when applied to real datasets (Ness et al, 2019; Stimson et al, 2019; Walter et al, 2019). In this case, it is important to make sure that the parameters used for the simulation are realistic for the pathogen of interest in the real data.…”