2014
DOI: 10.1371/journal.pone.0104579

Choice of Reference Sequence and Assembler for Alignment of Listeria monocytogenes Short-Read Sequence Data Greatly Influences Rates of Error in SNP Analyses

Abstract: The wide availability of whole-genome sequencing (WGS) and an abundance of open-source software have made detection of single-nucleotide polymorphisms (SNPs) in bacterial genomes an increasingly accessible and effective tool for comparative analyses. Thus, ensuring that real nucleotide differences between genomes (i.e., true SNPs) are detected at high rates and that the influences of errors (such as false positive SNPs, ambiguously called sites, and gaps) are mitigated is of utmost importance. The choices rese…

Cited by 78 publications (68 citation statements)
References 39 publications
“…Hence, SNP-based analysis has been very useful in describing the transmission of a pathogen within a single or closely related outbreak (Harris et al 2013; Walker et al 2013; Schmid et al 2014) but not when applied to distantly related isolates. Also, the quality of the raw data generated, the method of assembly, and the reference strains influence the output of SNPs (Pightling et al 2014). On the basis of the 4 above mentioned principles, several bioinformatics tools have been devised and employed to determine the relation between the strains.…”
Section: Discussion (mentioning)
confidence: 99%
“…One of the least computationally intensive ways to detect SNPs is reference-guided assembly. After the reads have been mapped to a reference genome, a SNP caller can be used to identify the SNPs between the genomes; however, this introduces the issue of reference bias, and the choice of reference genome is extremely important (31). It is also possible to do a SNP analysis of de novo-assembled genomes.…”
Section: Single Nucleotide Polymorphism-based Analysis (mentioning)
confidence: 99%
“…For comparisons and benchmark tests for aligners see [53][54][55][56][57] and the excellent review [50]. The list of aligners is updated online [58].…”
Section: Aligning Reads To a Reference Genome And/or Assembly Reads (mentioning)
confidence: 99%