2016
DOI: 10.1099/mgen.0.000056

SNP-sites: rapid efficient extraction of SNPs from multi-FASTA alignments

Abstract: Rapidly decreasing genome sequencing costs have led to a proportionate increase in the number of samples used in prokaryotic population studies. Extracting single nucleotide polymorphisms (SNPs) from a large whole genome alignment is now a routine task, but existing tools have failed to scale efficiently with the increased size of studies. These tools are slow, memory inefficient and are installed through non-standard procedures. We present SNP-sites which can rapidly extract SNPs from a multi-FASTA alignment …
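To make the task in the abstract concrete, the following is a minimal Python sketch of extracting variable sites from a multi-FASTA alignment. It is not the SNP-sites implementation, and it is not written for speed or memory efficiency. The function names (`parse_fasta`, `variable_sites`, `snp_alignment`) and the toy alignment are hypothetical, and the rule used here (a column counts as variable when it contains more than one of A, C, G, T, with gaps and other symbols ignored) is an assumption made for illustration.

```python
from typing import Dict, List


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse multi-FASTA text into {sequence name: sequence}."""
    records: Dict[str, List[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the name.
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}


def variable_sites(alignment: Dict[str, str]) -> List[int]:
    """Return 0-based column indices holding more than one of A, C, G, T.

    Assumption for this sketch: gaps and ambiguity codes are ignored
    when deciding whether a column is variable.
    """
    seqs = list(alignment.values())
    if not seqs:
        return []
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences in an alignment must have equal length")
    sites = []
    for i in range(length):
        bases = {s[i] for s in seqs} & set("ACGT")
        if len(bases) > 1:
            sites.append(i)
    return sites


def snp_alignment(alignment: Dict[str, str]) -> Dict[str, str]:
    """Reduce each sequence to its characters at the variable columns."""
    sites = variable_sites(alignment)
    return {name: "".join(seq[i] for i in sites) for name, seq in alignment.items()}


# Toy alignment (hypothetical data): columns 1 and 4 vary.
EXAMPLE = """>s1
ACGTAC
>s2
ACGTTC
>s3
AGGT-C
"""

aln = parse_fasta(EXAMPLE)
print(variable_sites(aln))
print(snp_alignment(aln))
```

The reduced alignment keeps one column per variable site, which is the kind of SNP-only alignment that the citing studies below pass on to phylogenetic reconstruction.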

Citations: cited by 963 publications (909 citation statements)
References: 14 publications
“…Reads from bneohCR1 and bneohCR2 genomes mapped to 98.6% of the B. suis 1330 genome. SNPs were called from the alignment by use of Samtools (http://samtools.sourceforge.net/), and 34,307 variable sites across all isolates were extracted by using SNP sites (13). The resulting alignment of SNPs was used for maximum-likelihood phylogenetic reconstruction by use of RAxML version 7.0.4 (https://github.com/stamatak/standard-RAxML).…”
Section: The Study; citation type: mentioning
confidence: 99%
“…SNP data were generated from a single multiple FASTA alignment of nucleotides for each species and each locus separately through SNP sites (v2.4.0; Page et al, ). We filtered this SNP set using vcftools (v0.1.13; Danecek et al, ) to select genotypes that were called in all samples and had no missing data.…”
Section: Methods; citation type: mentioning
confidence: 99%
“…The consensus pseudo-sequences were generated from each BAM file and aligned with other sequences to generate whole-genome alignment of the study isolates. We adjusted for recombination and used Gubbins 29 to remove recombination sites from the whole-genome alignment and then generated an alignment of polymorphic (variable) sites only using SNP-Sites 30. An ML phylogenetic tree was then reconstructed using this SNP alignment, again using RAxML with the same parameters used to construct the core-gene alignment tree.…”
Section: Methods; citation type: mentioning
confidence: 99%