2012
DOI: 10.1186/1471-2105-13-61

EnzML: multi-label prediction of enzyme classes using InterPro signatures

Abstract: Background: Manual annotation of enzymatic functions cannot keep up with automatic genome sequencing. In this work we explore the capacity of InterPro sequence signatures to automatically predict enzymatic function. Results: We present EnzML, a multi-label classification method that can efficiently account also for proteins with multiple enzymatic functions: 50,000 in UniProt. EnzML was evaluated using a standard set of 300,747 proteins for which the manually curated Swiss-Prot and KEGG databases have agreeing Enzy…

Cited by 44 publications (47 citation statements)
References 28 publications
“…Exclusive reads per the method above for LineP summer photic viromes, summer aphotic viromes, winter photic viromes, and winter aphotic viromes were compared against the similarity matrix of proteins (SIMAP) released on August 20, 2013 (57) using BLASTX (58) to assign function as previously described (22). Briefly, these analyses were implemented using a custom data analysis pipeline written in Perl and bash shell and executed on a high-performance computer using PBSPro (blastpipeline_simap.tar).…”
Section: Construction Of Euler Diagrams Depicting Shared Read Content (mentioning)
confidence: 99%
“…Interpro ids in the SIMAP functional annotation were mapped to EC numbers using the swissprot_kegg_proteins_ec.csv as a mapping (59) (ipr_to_ec.pl). Read hit counts were normalized based on sequencing effort in the included viromes and converted into ipath2 format (create_ipath.pl) for visual representation in the ipath2 viewer (58).…”
Section: Construction Of Euler Diagrams Depicting Shared Read Content (mentioning)
confidence: 99%
“…Sequenced-based protein function prediction has hundreds of thousands of annotated sequences available [18]. …”
Section: Methods (mentioning)
confidence: 99%
“…Indeed, the circularity of the combined process of propagating annotations and then predicting function, based on the same annotations and homologies, may be problematic. Sequence-based enzyme function predictions based on EC number annotations in databases can indeed give very impressive results [35] and such predictive exercises can be extended to include mechanism [36], both processes usually operating mostly via the detection of homology -although 3D structure-based methods also exist [37,38,39]. Using mechanisms and catalytic chains as defined in MACiE, the corresponding UniProt sequences are interrogated against InterPro signatures [29] to re-express the MACiE entries in terms of the signatures present in them.…”
Section: Protein Function Prediction (mentioning)
confidence: 99%
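
One of the citation statements above describes mapping InterPro IDs to EC numbers through a lookup table and then normalizing read hit counts by sequencing effort. The following is a minimal Python sketch of that kind of step, not the cited authors' Perl scripts. The function names, the in-memory dictionaries, the placeholder InterPro IDs, and the per-million-reads normalization are all assumptions made for illustration; the actual layout of swissprot_kegg_proteins_ec.csv and the normalization used in the cited work are not given in the quoted text.

```python
from collections import defaultdict


def map_ipr_to_ec(ipr_hits, ipr_to_ec):
    """Sum read hit counts per EC number.

    ipr_hits:  {InterPro ID: hit count} for one sample (hypothetical input).
    ipr_to_ec: {InterPro ID: list of EC numbers} (hypothetical lookup table).
    InterPro IDs with no EC mapping are skipped.
    """
    ec_counts = defaultdict(float)
    for ipr_id, count in ipr_hits.items():
        for ec in ipr_to_ec.get(ipr_id, ()):
            # A signature linked to several EC numbers contributes to each one.
            ec_counts[ec] += count
    return dict(ec_counts)


def normalize_by_effort(ec_counts, total_reads, scale=1_000_000):
    """Express counts as hits per `scale` sequenced reads (assumed scheme)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return {ec: count * scale / total_reads for ec, count in ec_counts.items()}


# Placeholder IDs and pairings, not real annotations.
example_hits = {"IPR_X": 3, "IPR_Y": 2, "IPR_Z": 5}
example_mapping = {"IPR_X": ["1.1.1.1"], "IPR_Y": ["1.1.1.1", "2.7.1.1"]}

example_ec = map_ipr_to_ec(example_hits, example_mapping)
example_norm = normalize_by_effort(example_ec, total_reads=1000)
```

Here a signature mapped to more than one EC number adds its count to each of them, which mirrors the multi-label setting EnzML addresses; whether the cited pipeline handled such cases this way is not stated.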