2020
DOI: 10.1093/bioinformatics/btaa1051

DeepNOG: fast and accurate protein orthologous group assignment

Abstract: Motivation: Protein orthologous group databases are powerful tools for evolutionary analysis, functional annotation or metabolic pathway modeling across lineages. Sequences are typically assigned to orthologous groups with alignment-based methods, such as profile hidden Markov models, which have become a computational bottleneck. Results: We present DeepNOG, an extremely fast and accurate, alignment-free orthology assignment me…

citations
Cited by 21 publications
(14 citation statements)
references
References 33 publications
“…In particular, we used a very low threshold on redundancy (<= 20% sequence identity). This is a much stricter threshold than applied in related studies on protein sequence classification such as DeepFam (35) and DeepNOG (36) which did not apply any redundancy removal filters, and Bepler and Berger’s study (33) that used a sequence identity filter of 40%, but CATHe achieved a comparable performance.…”
Section: Discussion (mentioning)
confidence: 99%
“…DeepFam attained a prediction accuracy of 97.17% on the family level on the GPCR dataset, and 95.4% on the COG dataset (with protein families having at least 500 sequences). DeepNOG (36) employed a method similar to DeepFam to classify sequences from COG and eggNOG5 databases. Furthermore, deep learning techniques have been used not just for family-level classification, but also for fold recognition, in FoldHSphere (37).…”
Section: Introduction (mentioning)
confidence: 99%
“…CSBFinder-S (v0.6.3) (92) was used with the default settings to find the gmk-rpoZ synteny across 23,517 fully sequenced bacterial genomes downloaded from the NCBI genome database. DeepNOG (v1.2.3) (93) was run using the default setting to obtain the COG (Clusters of Orthologous Genes) ID for each gene. Strand information was obtained from the corresponding genomic.gff file for every genome downloaded.…”
Section: Methods (mentioning)
confidence: 99%
“…A Clusters of Orthologous Groups of proteins (COG) annotation analysis was performed using HMMER [52]. Gene Ontology (GO) functional enrichment and Kyoto Encyclopedia of Genes and Genome (KEGG) pathway analysis were carried out by Goatools (https://github.com/tanghaibao/Goatools, accessed on 7 June 2022) and KOBAS (http://kobas.cbi.pku.edu.cn/home.do, accessed on 7 June 2022) at a p value ≤ 0.05 or corrected p value ≤ 0.05 [53,54].…”
Section: Functional Annotation and Enrichment (mentioning)
confidence: 99%