2010
DOI: 10.1093/bioinformatics/btq229

A low-polynomial algorithm for assembling clusters of orthologous groups from intergenomic symmetric best matches

Abstract: Motivation: Identifying orthologous genes in multiple genomes is a fundamental task in comparative genomics. Construction of intergenomic symmetrical best matches (SymBets) and joining them into clusters is a popular method of ortholog definition, embodied in several software programs. Despite their wide use, the computational complexity of these programs has not been thoroughly examined. Results: In this work, we show that in the standard approach of iteration through all triangles of SymBets, the memory scale…

Cited by 197 publications (176 citation statements)
References 41 publications
“…Aside from the increase in the number and size of ATGCs due to the inclusion of new genomes of bacteria and archaea, the major difference between this and previous ATGC versions was the exclusion of lower-quality drafts of incomplete genomes. In addition, the pangenome of each ATGC is now represented by automatically derived COGs [67,68,82].…”
Section: Methods (mentioning)
confidence: 99%
“…and protein coverage of 75% [82]. Second, proteins unassigned in the first stage were added to the cluster that they match best using the COGNITOR method [83], with the stringent threshold e-value 1 × 10^−20 and protein coverage of 75%.…”
Section: Methods (mentioning)
confidence: 99%
“…For delineation of the core genes and pan-genomes, a database of all the predicted proteins from the whole faustovirus genome was created. Protein clusters were built using COG triangles (36) and OrthoMCL (37) clustering algorithms (38), and the core genes and pan-genome were defined using GET_HOMOLOGUES software (38) with the following parameters: 75% minimum coverage and 30% minimum identity for the pairwise sequence alignments, with 1e−05 as the maximum E value.…”
Section: Flow Cytometric Analyses (I) Detection Of Amoeba Lysis By F (mentioning)
confidence: 99%
“…In previous work, we constructed >2,000 phage orthologous groups (POGs), including genes from >500 phage genomes (24, 25, 26). We found that despite the ability of phages to acquire genes from their bacterial hosts, at least half of these POGs consist of genes that were never or only very rarely observed in bacteria outside recently integrated prophages.…”
mentioning
confidence: 99%