Contaminant DNA in bacterial sequencing experiments is a major source of false genetic variability

Goig, Galo A.; Blanco, Silvia; García‐Basteiro, Alberto L.; Comas, Iñaki

doi:10.1186/s12915-020-0748-z

Cited by 55 publications

(43 citation statements)

References 62 publications

“…Other groups have identified methods to reduce additional sources of error in genomic epidemiology studies. For example, taxonomic filtering can importantly exclude reads from contaminating microbial species [49]. Additionally, other work has found that calling variants for samples independently rather than jointly may improve sensitivity for detecting low-frequency microbial variants [50].…”

Section: Discussion (mentioning)

confidence: 99%

Genomic variant-identification methods may alter Mycobacterium tuberculosis transmission inferences

et al. 2020

Pathogen genomic data are increasingly used to characterize global and local transmission patterns of important human pathogens and to inform public health interventions. Yet, there is no current consensus on how to measure genomic variation. To test the effect of the variant-identification approach on transmission inferences for Mycobacterium tuberculosis, we conducted an experiment in which five genomic epidemiology groups applied variant-identification pipelines to the same outbreak sequence data. We compared the variants identified by each group in addition to transmission and phylogenetic inferences made with each variant set. To measure the performance of commonly used variant-identification tools, we simulated an outbreak. We compared the performance of three mapping algorithms, five variant callers and two variant filters in recovering true outbreak variants. Finally, we investigated the effect of applying increasingly stringent filters on transmission inferences and phylogenies. We found that variant-calling approaches used by different groups do not recover consistent sets of variants, which can lead to conflicting transmission inferences. Further, performance in recovering true variation varied widely across approaches. While no single variant-identification approach outperforms others in both recovering true genome-wide and outbreak-level variation, variant-identification algorithms calibrated upon real sequence data or that incorporate local reassembly outperform others in recovering true pairwise differences between isolates. The choice of variant filters contributed to extensive differences across pipelines, and applying increasingly stringent filters rapidly eroded the accuracy of transmission inferences and quality of phylogenies reconstructed from outbreak variation. Commonly used approaches to identify M. tuberculosis genomic variation have variable performance, particularly when predicting potential transmission links from pairwise genetic distances. Phylogenetic reconstruction may be improved by less stringent variant filtering. Approaches that improve variant identification in repetitive, hypervariable regions, such as long-read assemblies, may improve transmission inference.

“…Given the potential presence of contaminant DNA not corresponding to MTBC, the Kraken software V2 13 was first used to classify the WGS reads. Further focus was directed only at those reads that belonged to MTBC species 14 . The WGS analysis, including mapping and variant calling (SNP and INDELS), was performed following a previously reported pipeline 7 , 15 , which has been described, validated and available online at http://tgu.ibv.csic.es/?page_id=1794 .…”

Section: Methods (mentioning)

confidence: 99%

Whole genomic sequencing based genotyping reveals a specific X3 sublineage restricted to Mexico and related with multidrug resistance

Jiménez-Ruano, Madrazo-Moya, Cancino-Muñoz, et al. 2021, Sci Rep (self-citation)

Whole genome sequencing (WGS) has been shown to be superior to traditional procedures of genotyping in tuberculosis (TB), nevertheless, reports of its use in drug resistant TB (DR-TB) isolates circulating in Mexico, are practically unknown. Considering the above the main of this work was to identify and characterize the lineages and genomic transmission clusters present in 67 DR-TB isolates circulating in southeastern Mexico. The results show the presence of three major lineages: L1 (3%), L2 (3%) and L4 (94%), the last one included 16 sublineages. Sublineage 4.1.1.3 (X3) was predominant in 18 (27%) of the isolates, including one genomic cluster, formed by eleven multidrug resistant isolates and sharing the SIT 3278, which seems to be restricted to Mexico. By the use of WGS, it was possible to identify the high prevalence of L4 and a high number of sublineages circulating in the region, also was recognized the presence of a novel X3 sublineage, formed exclusively by multidrug resistant isolates and with restrictive circulation in Mexico for at least the past 17 years.
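
The citation statement for this entry describes classifying WGS reads with Kraken and then keeping only the reads assigned to MTBC. A minimal Python sketch of that filtering idea follows. It is not the cited pipeline. It assumes Kraken 2's standard tab-separated per-read output (classification status, read ID, taxonomy ID, read length, LCA mapping) with numeric taxonomy IDs, and the function name `select_read_ids` and the set of taxonomy IDs passed to it are hypothetical choices made for illustration.

```python
from typing import Iterable, Set


def select_read_ids(kraken_lines: Iterable[str], keep_taxids: Set[str]) -> Set[str]:
    """Return IDs of classified reads whose assigned taxid is in keep_taxids.

    Assumes each line has tab-separated fields: status ("C" or "U"),
    read ID, taxonomy ID, read length, LCA mapping.
    """
    kept: Set[str] = set()
    for line in kraken_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        status, read_id, taxid = fields[0], fields[1], fields[2]
        if status == "C" and taxid in keep_taxids:
            kept.add(read_id)
    return kept


# Illustrative input only; the taxid set is an assumption and must list
# every taxon to retain, since this sketch does not walk the taxonomy tree.
example = [
    "C\tread1\t1773\t150\t1773:116",
    "U\tread2\t0\t150\t0:116",
    "C\tread3\t562\t150\t562:116",
    "C\tread4\t77643\t150\t77643:116",
]
mtbc_like = {"1773", "77643"}
selected = select_read_ids(example, mtbc_like)
```

The resulting read IDs could then be used to subset the FASTQ files before mapping and variant calling. Reads assigned to taxa below the listed IDs are dropped by this sketch unless their taxonomy IDs are added to the set explicitly.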

“…DNA that is not the focus of the study). While this is an important consideration in single genome studies (Goig et al 2020), it represents a particular challenge in mWGS, where host DNA likely to be highly prevalent in samples and hence constitute a significant fraction of the sequenced reads.…”

Section: Matching Sequences To Reference Databases (mentioning)

confidence: 99%

Metagenomics: a path to understanding the gut microbiome

Yen, Johnson 2021, Mamm Genome

The gut microbiome is a major determinant of host health, yet it is only in the last 2 decades that the advent of next-generation sequencing has enabled it to be studied at a genomic level. Shotgun sequencing is beginning to provide insight into the prokaryotic as well as eukaryotic and viral components of the gut community, revealing not just their taxonomy, but also the functions encoded by their collective metagenome. This revolution in understanding is being driven by continued development of sequencing technologies and in consequence necessitates reciprocal development of computational approaches that can adapt to the evolving nature of sequence datasets. In this review, we provide an overview of current bioinformatic strategies for handling metagenomic sequence data and discuss their strengths and limitations. We then go on to discuss key technological developments that have the potential to once again revolutionise the way we are able to view and hence understand the microbiome.
