2004
DOI: 10.1111/j.1462-2920.2004.00624.x

Application of tetranucleotide frequencies for the assignment of genomic fragments

Abstract: A basic problem of the metagenomic approach in microbial ecology is the assignment of genomic fragments to a certain species or taxonomic group, when suitable marker genes are absent. Currently, the (G + C)-content together with phylogenetic information and codon adaptation for functional genes is mostly used to assess the relationship of different fragments. These methods, however, can produce ambiguous results. In order to evaluate sequence-based methods for fragment identification, we extensively compared (…
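To make the two quantities named in the abstract concrete, here is a minimal Python sketch that computes the (G + C)-content and the relative tetranucleotide frequencies of a single fragment. The function names are hypothetical, and the choices to count on one strand only and to skip windows containing ambiguous bases are assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter
from itertools import product


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    counts = Counter(seq.upper())
    acgt = sum(counts[base] for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (counts["G"] + counts["C"]) / acgt


def tetranucleotide_frequencies(seq: str) -> dict[str, float]:
    """Relative frequency of each of the 256 possible tetranucleotides.

    Assumption: overlapping windows on the given strand only; windows
    containing anything other than A, C, G or T are skipped.
    """
    seq = seq.upper()
    counts = {"".join(p): 0 for p in product("ACGT", repeat=4)}
    total = 0
    for i in range(len(seq) - 3):
        word = seq[i:i + 4]
        if word in counts:
            counts[word] += 1
            total += 1
    if total == 0:
        return {word: 0.0 for word in counts}
    return {word: n / total for word, n in counts.items()}
```

Two fragments can then be compared either by the single number from `gc_content` or by the 256-dimensional vector from `tetranucleotide_frequencies`, which carries considerably more information per fragment.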

Citations: cited by 327 publications (335 citation statements)
References: 27 publications
“…To validate the method of using BLAST for contaminant removal, we investigated whether contaminant contigs showed distinct tetranucleotide frequencies, which are known to vary among organisms (Teeling et al 2004). Our results show that those contigs removed using BLAST do have distinct tetranucleotide frequencies and are therefore unlikely to be part of the target genome (Supplemental Fig.…”
Section: Gmds Produce Better Sequencing Results Than Single Cells (mentioning)
confidence: 99%
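The check described in this statement (do the removed contigs have tetranucleotide frequencies distinct from the target genome?) could be sketched as follows. This is an illustrative assumption, not the cited authors' procedure: the Pearson correlation as the similarity measure, the `threshold` value of 0.5, and all function names are hypothetical choices.

```python
import math
from itertools import product

WORDS = ["".join(p) for p in product("ACGT", repeat=4)]


def profile(seq: str) -> list[float]:
    """Tetranucleotide relative frequencies in a fixed word order."""
    seq = seq.upper()
    counts = {word: 0 for word in WORDS}
    for i in range(len(seq) - 3):
        word = seq[i:i + 4]
        if word in counts:
            counts[word] += 1
    total = sum(counts.values()) or 1  # avoid division by zero
    return [counts[word] / total for word in WORDS]


def pearson(u: list[float], v: list[float]) -> float:
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return cov / (norm_u * norm_v)


def flag_distinct_contigs(contigs: dict[str, str], reference: str,
                          threshold: float = 0.5) -> list[str]:
    """Names of contigs whose profile correlates poorly with the reference.

    The threshold is an arbitrary illustrative value, not a published cutoff.
    """
    ref = profile(reference)
    return [name for name, seq in contigs.items()
            if pearson(profile(seq), ref) < threshold]
```

In practice the reference would be the pooled sequence of contigs trusted to belong to the target genome, and short contigs would give noisy profiles, so a minimum length filter would be sensible before flagging anything.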
“…doe.gov/ including assembly methods, base error and possible mis-assembly corrections. Contigs were further analyzed using nucleotide word frequency (Teeling et al, 2004) principal component analysis (NWF-PCA) to separate novel archaeal group 1 (NAG1) contigs from other species (performed at http://gos.jcvi.org/openAccess/scatterPlotViewer2.html with the following parameters: word size = four, minimum contig size at 2000 bases, and the chop sequence size at 4000 bases as described previously for YNP ferric iron mat systems; Inskeep et al, 2010). Further verification of NAG1 contigs was obtained by plotting coverage vs G + C content for all metagenome contigs, which clearly verified the NAG1 sequences as separate from other populations.…”
Section: Methods (mentioning)
confidence: 99%
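The preprocessing parameters quoted above (minimum contig size 2000 bases, chop size 4000 bases) suggest a filtering and windowing step before word frequencies are computed. The sketch below shows one plausible reading of that step. How the viewer at the quoted URL actually treats short contigs and trailing remainders is not stated in the excerpt, so the behaviour here (non-overlapping windows, remainders kept only if they reach the minimum size) and the function name are assumptions. The PCA step itself is not shown.

```python
def chop_contigs(contigs: dict[str, str],
                 min_contig_size: int = 2000,
                 chop_size: int = 4000) -> dict[str, str]:
    """Drop short contigs and cut the rest into non-overlapping pieces.

    Assumptions (not confirmed by the source):
      * contigs shorter than min_contig_size are discarded;
      * pieces are consecutive windows of chop_size bases;
      * a trailing remainder is kept only if it is at least
        min_contig_size bases long.
    Piece names encode the half-open coordinate range "name:start-end".
    """
    pieces = {}
    for name, seq in contigs.items():
        if len(seq) < min_contig_size:
            continue
        for start in range(0, len(seq), chop_size):
            piece = seq[start:start + chop_size]
            if len(piece) >= min_contig_size:
                pieces[f"{name}:{start}-{start + len(piece)}"] = piece
    return pieces
```

Chopping long contigs into pieces of similar length keeps each point in the later PCA based on a comparable number of words, so that long contigs do not simply yield smoother profiles than short ones.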
“…Comparison of NAG1 sequence to reference databases (via blastp or blastn) revealed a consistent pattern (that is, poor sequence similarity to current reference genomes), and is clearly different than the other three to four predominant populations present in these systems (Kozubal et al, 2012). The assemblies were evaluated using NWF-PCA (Teeling et al, 2004), which showed that the sequence content and character (that is, G + C content, codon usage) are nearly identical among the four NAG1 replicates, and that these assemblies differ significantly compared with four representative phyla within the domain Archaea, including those from the Crenarchaeota, Euryarchaeota and Thaumarchaeota (Figure 1b).…”
Section: Metagenome Assemblies (mentioning)
confidence: 99%
“…Oligonucleotide patterns were determined to obtain phylogenetic signals (Teeling et al, 2004) by counting the frequencies of all possible tri-, tetra-, penta- and hexa-nucleotide combinations for each scaffold ≥20 000 bp. Frequency counts were normalized by the length of the respective scaffold and subjected to k-means clustering (Kanungo et al, 2002) using the a priori value of k equal to 8 (see Supplementary Information Section 3 for rationale).…”
Section: Clustering and Characterization Of Assemblies (mentioning)
confidence: 99%
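The pipeline in this statement (count tri- to hexanucleotides per scaffold, normalize by scaffold length, cluster with k-means) can be sketched in plain Python as below. This is a rough illustration under stated assumptions: the clustering is a basic Lloyd's iteration with random initial centroids, not the Kanungo et al. algorithm cited in the quote, and the function names, the `seed` and `iterations` parameters, and the handling of ambiguous bases are all hypothetical.

```python
import random
from itertools import product


def oligo_features(seq: str, ks: tuple[int, ...] = (3, 4, 5, 6)) -> list[float]:
    """Counts of every k-mer for each k in ks, divided by sequence length."""
    seq = seq.upper()
    length = max(len(seq), 1)  # guard against empty input
    features = []
    for k in ks:
        counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
        for i in range(len(seq) - k + 1):
            word = seq[i:i + k]
            if word in counts:  # skip windows with ambiguous bases
                counts[word] += 1
        features.extend(n / length for n in counts.values())
    return features


def squared_distance(a: list[float], b: list[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: list[list[float]], k: int,
           iterations: int = 100, seed: int = 0) -> list[int]:
    """Basic Lloyd's algorithm; returns one cluster label per point."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [-1] * len(points)
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an emptied cluster keeps its old centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

With the settings from the quote, one would keep only scaffolds of at least 20 000 bp, build `oligo_features` for each with the default `ks`, and call `kmeans(features, k=8)`; each feature vector then has 64 + 256 + 1024 + 4096 = 5440 entries.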
“…Oligonucleotide frequency patterns contain phylogenetic information (Pride et al, 2003; Teeling et al, 2004) and have been used as a tool to determine phylogenetic signatures in metagenomic data from microbial communities (Woyke et al, 2006; Wilmes et al, 2008; Dick et al, 2009; Inskeep et al, 2010). Annotation of open reading frames was used to identify phylogenetically and functionally informative genes in the scaffolds.…”
Section: Introduction (mentioning)
confidence: 99%