2014
DOI: 10.7717/peerj.620

An evaluation of alternative methods for constructing phylogenies from whole genome sequence data: a case study with Salmonella

Abstract: Comparative genomics based on whole genome sequencing (WGS) is increasingly being applied to investigate questions within evolutionary and molecular biology, as well as questions concerning public health (e.g., pathogen outbreaks). Given the impact that conclusions derived from such analyses may have, we have evaluated the robustness of clustering individuals based on WGS data to three key factors: (1) next-generation sequencing (NGS) platform (HiSeq, MiSeq, IonTorrent, 454, and SOLiD), (2) algorithms used to …

Citations: cited by 45 publications (51 citation statements)
References: 49 publications
“…For the case of this E. coli dataset, the phylogeny inferred by Mugsy, a reference-independent approach, was in topological agreement with other reference-dependent approaches (Table 2). In fact, kSNPv3 was one of the only methods that returned a topology that was inconsistent with all other methods (Table 2); an inconsistent kSNP phylogeny has also been reported in the analysis of other datasets (Pettengill et al, 2014). To analyze this further, we identified SNPs (n = 826) from the NASP run using simulated paired-end reads that were uniquely shared on a branch of the phylogeny that defines a monophyletic lineage (Fig.…”
Section: Pipeline Comparisons On E Coli Genomes Data Set
Citation type: mentioning
confidence: 97%
“…Previously, it has been demonstrated that different phylogenies can be obtained for the same dataset using either RAxML or FastTree2 (Pettengill et al, 2014). To test this result across multiple phylogenetic inference methods, the NASP E. coli read dataset was used.…”
Section: Phylogeny Differences For the Same Dataset
Citation type: mentioning
confidence: 99%
“…In the United States, nationwide real-time whole-genome sequencing (WGS) was implemented using the GenomeTrakr and PulseNet network to enhance listeriosis outbreak detection and investigation (14). In several outbreak investigations, the U.S. Centers for Disease Control and Prevention (CDC) had employed a whole-genome multilocus sequence typing (wgMLST) tool that targets the allelic differences in genome-wide coding regions (14), and the U.S. Food and Drug Administration (FDA) had employed a reference-based Center for Food Safety and Applied Nutrition (CFSAN) SNP Pipeline that identifies single nucleotide polymorphisms (SNPs) in the entire genome, including core genes, accessory genes, and intergenic regions (8, 11, 15). …”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…This eliminates any biases potentially introduced due to the selection of a reference and allows for the detection of SNVs not present in the reference genome. However, as noted by Pettengill et al, a reference-free approach may lead to a higher SNV false discovery rate without appropriate thresholds (177). The software package kSNP (178,179) takes a reference-free approach to identifying SNVs by breaking up each genomic data set into k-mers and comparing these k-mers.…”
Section: Phylogenetics To Phylogenomics
Citation type: mentioning
confidence: 99%