2017
DOI: 10.1101/gr.213959.116

metaSPAdes: a new versatile metagenomic assembler

Abstract: While metagenomics has emerged as a technology of choice for analyzing bacterial populations, the assembly of metagenomic data remains challenging, thus stifling biological discoveries. Moreover, recent studies revealed that complex bacterial populations may be composed from dozens of related strains, thus further amplifying the challenge of metagenomic assembly. metaSPAdes addresses various challenges of metagenomic assembly by capitalizing on computational ideas that proved to be useful in assemblies of sing…

Cited by 3,161 publications (2,672 citation statements)
References 87 publications
“…However, that study used whole-metagenome-assembly-based approaches to achieve strain-level taxonomic resolution of the STEC in the samples. Whole-metagenome assembly is a computationally intensive, time-consuming process, as illustrated by Nurk et al, who recently reported that metagenome assembly can take between 1.5 h and 6 h, with a memory footprint ranging from 7.3 GB to 234.5 GB, depending on the chosen assembler, for processing of a single human gut metagenomic sample (28). Thus, the application of more rapid, less intensive bioinformatic tools for strain detection is desirable.…”
Section: Discussion (mentioning)
confidence: 99%
“…Because of the stochastic sampling nature associated with WGS and the presence of sequencing errors, it is necessary for the reads to cover a single gene or genome many times (coverage), typically 30x-50x, to ensure high-quality de novo assembly [2]. Unlike in single genome sequencing projects, where the majority of the genomic regions are equally represented, in transcriptome and metagenome sequencing projects, different species of transcripts or genomes may have very unequal representation, up to several orders of magnitude; […] one would sequence the population at much higher depth than single genome projects. As in practice it is difficult to precisely estimate the required sequencing depth without knowing the community structure, sequencing large transcriptomes and complex metagenomes often generates as much data as the budget allows, producing 100-1000 GB of sequence data or more [15] [33].…”
mentioning
confidence: 99%
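The coverage requirement quoted above can be made concrete with the standard Lander-Waterman relation C = L·N/G (coverage equals read length times read count over genome size), and the metagenomic complication is that a species at relative abundance a receives only roughly a fraction a of the reads. The sketch below is illustrative only; the genome size, read length, target coverage, and abundance figures are assumptions, not values from the cited study.

```python
# Sketch: estimating the read count needed for a target de novo assembly
# coverage via the Lander-Waterman relation C = L * N / G.
# All concrete numbers here are illustrative assumptions.

def reads_needed(genome_size_bp: int, read_len_bp: int, target_coverage: float) -> int:
    """Smallest read count N such that read_len * N / genome_size >= target_coverage."""
    total_bases = int(target_coverage * genome_size_bp)
    return -(-total_bases // read_len_bp)  # ceiling division

# Example: a 5 Mb bacterial genome, 150 bp reads, 40x target coverage.
n = reads_needed(5_000_000, 150, 40.0)

# In a metagenome, a species at relative abundance a gets roughly a fraction a
# of all reads, so the total read budget scales by 1/a to hit the same coverage.
n_meta = n * 100  # same genome at 1% relative abundance
```

This is why the quoted passage notes that, without knowing the community structure in advance, sequencing depth is hard to budget precisely: the 1/a factor for rare community members dominates the total.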
“…(MetaSPAdes [26], MEGAHIT [19], etc.) or MPI to distribute on a cluster [11]. The shared-memory approach is very hard to scale up to exponentially increasing NGS data size.…”
mentioning
confidence: 99%
“…Metagenomic sequence data is analyzed by an ever-increasing range of bioinformatic tools. These tools are able to perform a wide variety of applications, for example, quality control analysis of the raw sequence data 28 , overlapping of paired end reads 29 , de novo assembly of sequence reads to contigs and scaffolds 30,31 , taxonomic classification and visualization of sequence reads and assembled sequences 7,12,32,33 and the functional annotation of assembled sequences 34,35 .…”
mentioning
confidence: 99%
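As a rough sketch of how the stages listed in that passage chain together, the helper below builds command lines for one possible tool choice per stage: fastp for quality control, metaSPAdes for assembly, and Kraken 2 for taxonomic classification. The `pipeline_commands` function and the database name `k2_db` are hypothetical conveniences for illustration; verify flags against each tool's own documentation before running anything.

```python
# Sketch of a minimal metagenomic workflow as argument lists suitable for
# subprocess.run(). Tool names and flags reflect common fastp / metaSPAdes /
# Kraken 2 usage; paths and the database name are placeholder assumptions.

def pipeline_commands(r1: str, r2: str, outdir: str) -> list[list[str]]:
    clean1 = f"{outdir}/clean_1.fq"
    clean2 = f"{outdir}/clean_2.fq"
    return [
        # 1. Quality control / trimming of the raw paired-end reads.
        ["fastp", "-i", r1, "-I", r2, "-o", clean1, "-O", clean2],
        # 2. De novo metagenomic assembly of the cleaned reads.
        ["metaspades.py", "-1", clean1, "-2", clean2, "-o", f"{outdir}/asm"],
        # 3. Taxonomic classification of the reads ("k2_db" is a placeholder).
        ["kraken2", "--db", "k2_db", "--paired", clean1, clean2,
         "--report", f"{outdir}/kraken.report"],
    ]

cmds = pipeline_commands("reads_1.fq", "reads_2.fq", "out")
```

Keeping each stage as a plain argument list makes it easy to swap tools (e.g. MEGAHIT for metaSPAdes) without restructuring the workflow.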