2012
DOI: 10.1093/bib/bbs054

Classification of metagenomic sequences: methods and challenges

Abstract: Characterizing the taxonomic diversity of microbial communities is one of the primary objectives of metagenomic studies. Taxonomic analysis of microbial communities, a process referred to as binning, is challenging for the following reasons. Primarily, query sequences originating from the genomes of most microbes in an environmental sample lack taxonomically related sequences in existing reference databases. This absence of a taxonomic context makes binning a very challenging task. Limitations of current seque…

Cited by 203 publications (152 citation statements)
References 56 publications
“…Sequence data can be segregated into bins representing distinct Operational Taxonomic Units (OTU), which may mean an individual organism or species, or a group that shares a certain set of observed characters. Two classes of binning processes include similarity-based (or align-based) and composition-based strategies [59]. Similarity-based binning searches for sequence similarities between samples and reference genomes in existing public databases.…”
Section: Network Inference (mentioning)
confidence: 99%
“…Moreover, we are presented with fragmented assemblies due to insufficient coverage, sequencing errors, sequence repetition, and genetic diversity. Consequently, alignment-free techniques [1], [2] have been introduced as an alternative way to analyse metagenomic data [3] by incorporating species-specific genomic signatures extracted by calculating the normalised frequency of k-mers of a specific size, e.g., commonly k = 4. This frequency is obtained by counting the occurrences of each k-mer combination and represents a feature vector in high-dimensional space.…”
Section: Introduction (mentioning)
confidence: 99%
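
The normalised k-mer frequency vector described in the statement above can be sketched roughly as follows. This is a minimal illustration, not the cited authors' implementation: the function name `kmer_frequencies`, the choice to skip windows containing non-ACGT characters, and the lexicographic ordering of the features are all assumptions made here for clarity.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(sequence, k=4):
    """Return the normalised k-mer frequency vector of a DNA sequence.

    Hypothetical helper for illustration. The vector has 4**k entries,
    one per k-mer over the alphabet ACGT, in lexicographic order.
    """
    seq = sequence.upper()

    # Every possible k-mer over ACGT, in a fixed order, so that vectors
    # computed from different sequences are directly comparable.
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]

    # Count every overlapping window of length k in the sequence.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

    # Assumption: windows with ambiguous bases (e.g. N) are ignored,
    # so only the 4**k canonical k-mers contribute to the total.
    total = sum(counts[kmer] for kmer in all_kmers)
    if total == 0:
        return [0.0] * len(all_kmers)

    # Normalise counts to frequencies that sum to 1.
    return [counts[kmer] / total for kmer in all_kmers]


if __name__ == "__main__":
    vector = kmer_frequencies("ACGTACGTTTGACCA", k=4)
    print(len(vector))  # 4**4 = 256 features for k = 4
```

With k = 4 each sequence maps to a point in a 256-dimensional space, which is the kind of feature vector that composition-based methods can then cluster or classify.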
“…In general, there are two methods to detect the taxonomic content of environmental samples: (1) sequencing phylogenetic marker genes, e.g. 16S rRNA, that requires PCR amplicons analysis; (2) Next Generation Sequencing, where all the genomic material of the sample is sequenced.…”
Section: Introduction (mentioning)
confidence: 99%