2013
DOI: 10.1186/1471-2105-14-s3-s10

Combining heterogeneous data sources for accurate functional annotation of proteins

Abstract: Combining heterogeneous sources of data is essential for accurate prediction of protein function. The task is complicated by the fact that while sequence-based features can be readily compared across species, most other data are species-specific. In this paper, we present a multi-view extension to GOstruct, a structured-output framework for function annotation of proteins. The extended framework can learn from disparate data sources, with each data source provided to the framework in the form of a kernel. Our …

Cited by 47 publications (35 citation statements)
References 37 publications
“…Several approaches to the prediction of protein functions were proposed in the literature, including sequence-based [10], [11], [12] and network-based methods [13], [14], [15], structured output algorithms based on kernels [3], [16], [17] and hierarchical ensemble methods [18], [19], [20]. In particular, the availability of large-scale networks, in which nodes are genes/proteins and edges their functional pairwise relationships, has promoted the development of several machine learning methods where novel annotations are inferred by exploiting the topology of the resulting biomolecular network.…”
Section: Introduction
confidence: 99%
“…The prediction of gene function generally proceeds by the transfer of function from genes with experimental evidence to unannotated, or less-annotated, genes that are similar by some measure [42]. While several methods use multiple data types to carry out predictions [31,47,11], many solely rely on evolutionary relationships [16,24,6,10] and are the focus of the current study.…”
Section: Introduction
confidence: 99%
“…For instance, Pérez et al. [40] introduced a dictionary-based system that extracts keywords from the literature or from databases and associates them with GO categories; other systems used pattern matching and sentence structure to retrieve sentences containing a protein along with Gene Ontology (GO) terms denoting function [8,26]. A recent function prediction system [56] identifies pairs of GO terms and proteins within abstracts, and uses them as part of an integrative similarity measure (kernel) employed in classifying proteins by function. Additional information extraction systems have been used in a variety of knowledge discovery tasks within the biomedical domain (see surveys, e.g., [11,25,52]).…”
Section: Introduction
confidence: 99%