Alexey Gurevich scite author profile

The lion's share of bacteria in various environments cannot be cloned in the laboratory and thus cannot be sequenced using existing technologies. A major goal of single-cell genomics is to complement gene-centric metagenomic data with whole-genome assemblies of uncultivated organisms. Assembly of single-cell data is challenging because of highly non-uniform read coverage as well as elevated levels of sequencing errors and chimeric reads. We describe SPAdes, a new assembler for both single-cell and standard (multicell) assembly, and demonstrate that it improves on the recently released E+V-SC assembler (specialized for single-cell data) and on the popular assemblers Velvet and SOAPdenovo (for multicell data). SPAdes generates single-cell assemblies, providing information about genomes of uncultivatable bacteria that vastly exceeds what may be obtained via traditional metagenomics studies. SPAdes is available online (http://bioinf.spbau.ru/spades). It is distributed as open source software.

QUAST: quality assessment tool for genome assemblies

Gurevich et al. 2013

Critical Assessment of Metagenome Interpretation—a benchmark of metagenomics software

et al. 2017

In metagenome analysis, computational methods for assembly, taxonomic profiling and binning are key components facilitating downstream biological data interpretation. However, a lack of consensus about benchmarking datasets and evaluation metrics complicates proper performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on datasets of unprecedented complexity and realism. Benchmark metagenomes were generated from ~700 newly sequenced microorganisms and ~600 novel viruses and plasmids, including genomes with varying degrees of relatedness to each other and to publicly available ones and representing common experimental setups. Across all datasets, assembly and genome binning programs performed well for species represented by individual genomes, while performance was substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below the family level. Parameter settings substantially impacted performances, underscoring the importance of program reproducibility. While highlighting current challenges in computational metagenomics, the CAMI results provide a roadmap for software selection to answer specific research questions.

Assembling Single-Cell Genomes and Mini-Metagenomes From Chimeric MDA Products

Nurk, Bankevich, Antipov et al. 2013

Journal of Computational Biology

Recent advances in single-cell genomics provide an alternative to largely gene-centric metagenomics studies, enabling whole-genome sequencing of uncultivated bacteria. However, single-cell assembly projects are challenging due to (i) the highly nonuniform read coverage and (ii) a greatly elevated number of chimeric reads and read pairs. While recently developed single-cell assemblers have addressed the former challenge, methods for assembling highly chimeric reads remain poorly explored. We present algorithms for identifying chimeric edges and resolving complex bulges in de Bruijn graphs, which significantly improve single-cell assemblies. We further describe applications of the single-cell assembler SPAdes to a new approach for capturing and sequencing "microbial dark matter" that forms small pools of randomly selected single cells (called a mini-metagenome) and further sequences all genomes from the mini-metagenome at once. On single-cell bacterial datasets, SPAdes improves on the recently developed E+V-SC and IDBA-UD assemblers specifically designed for single-cell sequencing. For standard (cultivated monostrain) datasets, SPAdes also improves on A5, ABySS, CLC, EULER-SR, Ray, SOAPdenovo, and Velvet. Thus, recently developed single-cell assemblers not only enable single-cell sequencing, but also improve on conventional assemblers on their own turf. SPAdes is available for free online download under a GPLv2 license.
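As a rough illustration of the data structure this abstract refers to, the sketch below builds a small de Bruijn graph from reads and flags low-multiplicity edges. It is not SPAdes's algorithm: the function names, the toy reads, and the fixed `min_count` cutoff are all assumptions made for this example. A fixed cutoff is a textbook heuristic for uniformly covered data, and the abstract's point about highly nonuniform coverage is exactly why such a cutoff is unreliable for single-cell reads.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each edge (k-1 prefix, k-1 suffix) to the number of k-mers supporting it."""
    edges = defaultdict(int)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer is an edge between its (k-1)-mer prefix and suffix.
            edges[(kmer[:-1], kmer[1:])] += 1
    return dict(edges)


def low_coverage_edges(edges, min_count):
    """Return edges seen fewer than min_count times (hypothetical flat cutoff)."""
    return sorted(edge for edge, count in edges.items() if count < min_count)


# Toy reads, invented for illustration only.
reads = ["ACGTAC", "ACGTAC", "CGTACG", "ACGTTC"]
edges = build_de_bruijn(reads, k=4)
suspicious = low_coverage_edges(edges, min_count=2)

for edge in suspicious:
    print(edge, edges[edge])
```

In this toy input the edges supported by a single k-mer are flagged, while the path shared by several reads is kept. Real single-cell data would need a coverage-aware criterion rather than one global threshold.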

Feature-based molecular networking in the GNPS analysis environment

et al. 2020

Molecular networking has become a key method to visualize and annotate the chemical space in non-targeted mass spectrometry data. We present Feature-Based Molecular Networking (FBMN) as an analysis method in the Global Natural Products Social Molecular Networking (GNPS) infrastructure that builds on chromatographic feature detection and alignment tools. The FBMN method brings quantitative analyses and isomeric resolution, including from ion-mobility spectrometry, into molecular networks.

Versatile genome assembly evaluation with QUAST-LG

et al. 2018

Motivation: The emergence of high-throughput sequencing technologies revolutionized genomics in the early 2000s. The next revolution came with the era of long-read sequencing. These technological advances, along with novel computational approaches, became the next step towards automatic pipelines capable of assembling nearly complete mammalian-size genomes.

Results: In this manuscript, we demonstrate the performance of state-of-the-art genome assembly software on six eukaryotic datasets sequenced using different technologies. To evaluate the results, we developed QUAST-LG, a tool that compares large genomic de novo assemblies against reference sequences and computes relevant quality metrics. Since genomes generally cannot be reconstructed completely due to complex repeat patterns and low coverage regions, we introduce a concept of upper bound assembly for a given genome and set of reads, and compute theoretical limits on assembly correctness and completeness. Using QUAST-LG, we show how close the assemblies are to the theoretical optimum, and how far this optimum is from the finished reference.

Availability and implementation: http://cab.spbu.ru/software/quast-lg

Supplementary information: Supplementary data are available at Bioinformatics online.
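To give a concrete sense of what an assembly quality metric looks like, here is a minimal sketch of N50 and NG50, two contiguity statistics commonly reported for assemblies. The abstract does not name these metrics, so their choice here is an assumption; the function name `nx50` and the contig lengths are invented for the example, and this is not QUAST-LG's implementation.

```python
def nx50(contig_lengths, total_length=None):
    """Length of the contig at which the running sum (longest first)
    reaches half of total_length.

    With total_length=None the assembly's own size is used (N50);
    passing a reference genome length instead gives NG50.
    Returns 0 if the contigs never reach half of total_length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if total_length is None:
        total_length = sum(lengths)
    half = total_length / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0


# Hypothetical contig lengths in base pairs, for illustration only.
contigs = [100, 80, 50, 30, 20, 10, 10]
n50 = nx50(contigs)                     # relative to the 300 bp assembly
ng50 = nx50(contigs, total_length=400)  # relative to an assumed 400 bp reference

print(n50, ng50)
```

Because NG50 is normalized by the reference length rather than the assembly size, it does not reward an assembly simply for leaving part of the genome out, which is in the spirit of the reference-based comparison the abstract describes.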

Assembling Genomes and Mini-metagenomes from Highly Chimeric Reads

et al. 2013

MetaQUAST: evaluation of metagenome assemblies

2015

SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing
