2014
DOI: 10.1093/nar/gku1223

Expanded microbial genome coverage and improved protein family annotation in the COG database

Abstract: Microbial genome sequencing projects produce numerous sequences of deduced proteins, only a small fraction of which have been or will ever be studied experimentally. This leaves sequence analysis as the only feasible way to annotate these proteins and assign to them tentative functions. The Clusters of Orthologous Groups of proteins (COGs) database (http://www.ncbi.nlm.nih.gov/COG/), first created in 1997, has been a popular tool for functional annotation. Its success was largely based on (i) its reliance on c…

Cited by 1,311 publications (1,112 citation statements)
References 74 publications (78 reference statements)

“…To gain an overview of its main functions, we next analyzed the A. actinomycetemcomitans essential genome for enriched Clusters of Orthologous Groups (COG) functional categories (49). As expected, VT1169 and 624 were enriched for functions known to be vital for bacterial life (50), such as lipid metabolism and cell wall biogenesis (COGs I and M) (Fig.…”
Section: Results (mentioning)
confidence: 83%
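
The excerpt above does not say which statistical test was used to call a COG category "enriched". One common choice is a one-sided hypergeometric test, sketched below under that assumption. The function name and all gene counts are hypothetical and chosen only for illustration; they are not taken from the cited study.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """One-sided hypergeometric p-value P(X >= k).

    N: total genes in the genome (hypothetical background)
    K: genes in the genome assigned to the COG category of interest
    n: genes in the tested subset (e.g. an essential-gene set)
    k: genes in the tested subset that fall in the category
    """
    upper = min(n, K)  # cannot observe more hits than the subset or category size
    total = comb(N, n)  # number of ways to draw the subset from the genome
    # Sum the probability of observing k, k+1, ..., upper category members.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Illustrative numbers only: 2000 genes, 100 in category M,
# 300 genes in the tested subset, 30 of which are in category M.
p_value = hypergeom_enrichment_p(k=30, n=300, K=100, N=2000)
print(f"enrichment p-value: {p_value:.3g}")
```

In practice the test would be repeated for every COG category and the resulting p-values corrected for multiple testing.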
“…To further test the model consistency with the empirical data, steady-state distributions were calculated for subsets of genes. The subsets were chosen based on the functional classes of genes as classified in the COG (Clusters of Orthologous Genes) database (17,18). The selection landscape and k_i were optimized for the best log-likelihood fit of the distribution predicted by the model to the genomic data (SI Appendix, Table S5).…”
Section: Results (mentioning)
confidence: 99%
“…In particular, we analyzed genes that are associated with specific cellular functions as classified in the COG database (17,18) under the assumption that functionally similar genes evolve under similar selection landscapes. We obtained good fits to the model for all functional classes, indicating that the conclusion on the typical beneficial effect of gene acquisition applies to functionally diverse classes of genes.…”
Section: Discussion (mentioning)
confidence: 99%
“…Only 462 (11.70%) of the genes in the database appeared fragmented and 658 (16.65%) were missing. We also obtained a high number of orthologous proteins in both the SwissProt and the X. tropicalis databases that fully matched (100% alignment coverage) or nearly fully corresponded (>80% alignment coverage) to unigenes in O. cruralis (Figure 3) … The database of Clusters of Orthologous Groups (COGs) is another common tool for functional annotation (Galperin et al 2015). In this database, orthologous genes from 722 prokaryote genomes are grouped according to their biological function.…”
mentioning
confidence: 99%