2018
DOI: 10.1371/journal.pone.0192504

Comparison of SNP-based subtyping workflows for bacterial isolates using WGS data, applied to Salmonella enterica serotype Typhimurium and serotype 1,4,[5],12:i:-

Abstract: Whole genome sequencing represents a promising new technology for subtyping of bacterial pathogens. Besides the technological advances which have pushed the approach forward, the last years have been marked by considerable evolution of the whole genome sequencing data analysis methods. Prior to application of the technology as a routine epidemiological typing tool, however, reliable and efficient data analysis strategies need to be identified among the wide variety of the emerged methodologies. In this work, w…

Citations: cited by 31 publications (28 citation statements)
References: 61 publications (76 reference statements)
“…Mapping, variant calling and phylogenetic analysis were performed by locally installed CFSAN SNP Pipeline v1.0.0 (40), an analysis workflow developed by the U.S Food and Drug Administration (FDA). CFSAN pipeline employs a 2-phase variant calling workflow and the ‘optimized’ version of the pipeline with criteria as applied by Saltykova et al (higher stringency in allele frequency thresholds and coverage) (41) was applied. In the first phase, variants were called based on mpileup function of SAMTools and mpileup2snp tool from VarScan (minimum average base quality=20, minimum read depth of coverage at site=12, minimum allele frequency=90%).…”
Section: Methods (mentioning)
confidence: 99%
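The first-phase thresholds quoted above (minimum average base quality 20, minimum read depth 12, minimum allele frequency 90%) can be illustrated with a minimal sketch. This is not the CFSAN SNP Pipeline or VarScan code: the `SiteCall` record, its field names, and `passes_thresholds` are hypothetical, and the quality check is simplified to a single per-site mean.

```python
from dataclasses import dataclass


@dataclass
class SiteCall:
    """Hypothetical per-site summary; not a VarScan or SAMtools data type."""
    position: int
    depth: int                # reads covering the site
    alt_reads: int            # reads supporting the variant allele
    mean_base_quality: float  # simplified: one average quality per site


# Thresholds as quoted in the citation statement.
MIN_BASE_QUALITY = 20
MIN_DEPTH = 12
MIN_ALLELE_FREQ = 0.90


def passes_thresholds(site: SiteCall) -> bool:
    """Return True if the site meets all three quoted thresholds."""
    if site.depth < MIN_DEPTH:
        return False
    if site.mean_base_quality < MIN_BASE_QUALITY:
        return False
    return site.alt_reads / site.depth >= MIN_ALLELE_FREQ


# Invented example sites, for illustration only.
sites = [
    SiteCall(position=101, depth=40, alt_reads=39, mean_base_quality=35.0),
    SiteCall(position=202, depth=10, alt_reads=10, mean_base_quality=36.0),  # depth too low
    SiteCall(position=303, depth=30, alt_reads=24, mean_base_quality=34.0),  # allele frequency 0.80
    SiteCall(position=404, depth=25, alt_reads=25, mean_base_quality=15.0),  # quality too low
]

kept = [s.position for s in sites if passes_thresholds(s)]
print(kept)
```

Raising the depth and allele-frequency cutoffs trades sensitivity for fewer false-positive SNP calls, which is the sense in which the quoted statement calls these settings higher stringency.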
“…Instead, they represent numerous diverse clades, with many being endemic to particular regions or jurisdictions (17–23). This inherent pathogen population structure poses a challenge to a successful transition to genomics in public health labs because it can significantly reduce the resolution of phylogenomic analyses by affecting the identification of genetic variants (24). Thus, catalogues of complete genomes will only be effective in supporting a transition to genomics in public health labs if they are rich in endemic strains.…”
Section: Table (mentioning)
confidence: 99%
“…This is a second problem that slows down the adoption of WGS in routine, i.e., the absence of an established methodology for the evaluation and comparison of WGS data analysis schemes. The different implementations of the methods described above can produce different outputs depending on the applied data processing steps and settings (Lüth et al, 2018;Saltykova et al, 2018). Studies which determine the performance characteristics such as reproducibility and discriminatory power of the WGS subtyping pipelines similarly to the classical methods, or that use various metrics to measure similarity between the SNP distance matrices and phylogenies generated by the different data analysis workflows have only recently started to emerge (David et al, 2016;Henri et al, 2017;Katz et al, 2017;Pearce et al, 2018;Saltykova et al, 2018).…”
Section: Introduction (mentioning)
confidence: 99%
“…The different implementations of the methods described above can produce different outputs depending on the applied data processing steps and settings (Lüth et al, 2018;Saltykova et al, 2018). Studies which determine the performance characteristics such as reproducibility and discriminatory power of the WGS subtyping pipelines similarly to the classical methods, or that use various metrics to measure similarity between the SNP distance matrices and phylogenies generated by the different data analysis workflows have only recently started to emerge (David et al, 2016;Henri et al, 2017;Katz et al, 2017;Pearce et al, 2018;Saltykova et al, 2018). Other important features such as the stability of the data analysis workflows toward the characteristics of the input sequencing data, the effect of the reference genome used for subtyping on the output, or the suitability of a workflow for subtyping of isolates with a particular relatedness level, are currently rarely assessed.…”
Section: Introduction (mentioning)
confidence: 99%
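As a rough illustration of what measuring similarity between the SNP distance matrices of two workflows can mean, the sketch below computes pairwise SNP distances from each workflow's allele strings and correlates them. It is a sketch under stated assumptions, not the metric used in any of the cited studies: the isolate names, allele strings, and helper functions are hypothetical, and a plain Pearson correlation stands in for whatever measures those studies applied.

```python
from itertools import combinations
from math import sqrt


def snp_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length allele strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def pairwise_distances(alleles: dict) -> list:
    """Distances for all isolate pairs, in a fixed (sorted-name) order."""
    names = sorted(alleles)
    return [snp_distance(alleles[p], alleles[q]) for p, q in combinations(names, 2)]


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


# Invented variant-site alleles per isolate, as called by two hypothetical workflows.
workflow_a = {"iso1": "ACGTAC", "iso2": "ACGTAT", "iso3": "TCGAAT"}
workflow_b = {"iso1": "ACGT", "iso2": "ACGT", "iso3": "TCGA"}

dist_a = pairwise_distances(workflow_a)
dist_b = pairwise_distances(workflow_b)
r = pearson(dist_a, dist_b)
print(dist_a, dist_b, round(r, 3))
```

A correlation near 1 would suggest the two workflows rank isolate pairs similarly even when the absolute SNP counts differ; it says nothing about which workflow is closer to the truth.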