2017
DOI: 10.1093/bib/bbx120

A review of methods and databases for metagenomic classification and assembly

Abstract: Microbiome research has grown rapidly over the past decade, with a proliferation of new methods that seek to make sense of large, complex data sets. Here, we survey two of the primary types of methods for analyzing microbiome data: read classification and metagenomic assembly, and we review some of the challenges facing these methods. All of the methods rely on public genome databases, and we also discuss the content of these databases and how their quality has a direct impact on our ability to interpret a mic…

Cited by 438 publications (386 citation statements)
References 163 publications (171 reference statements)
“…Ranges of bioinformatics tools facilitate data analysis, e.g., Kraken [75], Kaiju [76], VirusFinder [77]. Different workflows have recently been benchmarked and comprehensively overviewed elsewhere [78][79][80].…”
Section: Problems Of Metagenomic Approach (mentioning)
confidence: 99%
“…The former option is very demanding in terms of computational power, since millions of reads undergo a complex analysis [84]. The latter is subject to faults of its own: partial or low-quality reference sequences, false mapping and high genetic diversity resulting in numerous polymorphisms [79].…”
Section: Problems Of Metagenomic Approach (mentioning)
confidence: 99%
“…Sequences are then clustered into bins called ‘Operational Taxonomic Units’ based on similarity and are typically annotated based upon a representative sequence. 16S rRNA offers a cheap and fast method to analyse the microbial composition (‘who is there?’), but it incorporates amplification and systematic biases, and is not ideal for comprehensive phylogenetic assignation, and no information about the functional properties of the bacterial communities can be drawn. Alternatively, MGS is accomplished by untargeted sequencing of the whole genome of all microorganisms present in a sample.…”
Section: Microbial Composition and Its Functional Potential (mentioning)
confidence: 99%
“…16S rRNA offers a cheap and fast method to analyse the microbial composition (‘who is there?’), but it incorporates amplification and systematic biases, and is not ideal for comprehensive phylogenetic assignation, and no information about the functional properties of the bacterial communities can be drawn. Alternatively, MGS is accomplished by untargeted sequencing of the whole genome of all microorganisms present in a sample. Although more expensive than 16S rRNA, MGS avoids amplification biases and extends the information provided by 16S rRNA, allowing the identification of viruses, fungi and protozoa and the simultaneous characterization of both the microbial composition and genes.…”
Section: Microbial Composition and Its Functional Potential (mentioning)
confidence: 99%
“…While being faster in most cases, alignment-free methods are limited to the detection of sequences, whereas alignment-based methods potentially allow for a more detailed characterization of genomes. Existing approaches based on unbiased full genome sequencing of metagenomic samples are facing various obstacles, especially concerning the ranking of the results according to their clinical relevance and the long overall turnaround time [41][42][43][44][45][46][47][48]. A central issue in NGS-based pathogen detection is that the clinically relevant data is very hard to identify.…”
Section: Introduction (mentioning)
confidence: 99%