2015
DOI: 10.1089/cmb.2014.0110

UNIPred: Unbalance-Aware Network Integration and Prediction of Protein Functions

Abstract: The proper integration of multiple sources of data and the unbalance between annotated and unannotated proteins represent two of the main issues of the automated function prediction (AFP) problem. Most supervised and semisupervised learning algorithms for AFP proposed in the literature do not jointly consider these issues, which harms both sensitivity and precision, owing to the unbalance between annotated and unannotated proteins that characterizes the majority of functional classes and …

Cited by 18 publications (21 citation statements)
References 56 publications
“…Then we selected the GO terms with at least 20 proteins in C np , obtaining the terms summarized in Table 2. The proposed features were also tested in terms of their capability in predicting the protein functions, by selecting GO terms with 20–200 annotations in the later release, in order to have a minimum of information to train a classifier, and to exclude terms with a large number of annotations, because they are too generic [13, 24, 25]. The total number of obtained GO terms is shown in Table 3.…”
Section: Methods
Citation type: mentioning
confidence: 99%
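
The selection rule quoted above (keep GO terms with 20–200 annotations) can be sketched as a simple filter. This is only an illustration, not the citing authors' code: the function name `select_go_terms`, the input mapping, and the choice to treat both bounds as inclusive are assumptions, and the term labels in the usage note are placeholders rather than real GO identifiers.

```python
from typing import Dict, List


def select_go_terms(annotation_counts: Dict[str, int],
                    min_count: int = 20,
                    max_count: int = 200) -> List[str]:
    """Return the terms whose annotation count lies in [min_count, max_count].

    Assumption: both bounds are inclusive; the quoted text does not say.
    Terms below min_count carry too little information to train a classifier,
    and terms above max_count are treated as too generic.
    """
    return sorted(
        term
        for term, count in annotation_counts.items()
        if min_count <= count <= max_count
    )
```

For example, with the placeholder counts `{"term_a": 19, "term_b": 20, "term_c": 200, "term_d": 201}`, only `term_b` and `term_c` pass the filter.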
“…That is, before applying any algorithm to learn negative examples, it is of paramount importance studying which ‘protein representation’ is more informative for the problem itself. In this context, most information sources about the relationships between proteins are naturally represented through protein networks, where each node represents a protein and an edge the relationship between two proteins [13]; additionally, most approaches proposed for AFP are network-based [14–20]. Thus, the purpose here is twofold: extracting meaningful protein features from protein networks, and assessing their ability to improve the identification of good negative examples.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Although COSNET has already been validated in [14, 15, 37] to solve AFP, for sake of completeness we report in Table 2 its performances in predicting the GO terms. The generalization abilities of COSNET have been assessed through a 5-fold cross validation (CV), and evaluated in terms of Precision (the proportion of positives correctly predicted) and Recall (the proportion of real positive discovered) combined in the F measure, which is the harmonic mean of precision and recall.…”
Section: Results
Citation type: mentioning
confidence: 99%
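
The F measure mentioned in the statement above is the harmonic mean of precision and recall. A minimal sketch of that computation follows; the function name `f_measure`, its count-based arguments, and the convention of returning 0.0 when precision or recall is undefined are assumptions made for illustration, not details taken from the cited evaluation.

```python
def f_measure(true_positives: int,
              false_positives: int,
              false_negatives: int) -> float:
    """Harmonic mean of precision and recall, computed from raw counts.

    Assumption: when there are no predicted positives or no real positives,
    the score is reported as 0.0 instead of raising an error.
    """
    predicted_positives = true_positives + false_positives
    real_positives = true_positives + false_negatives
    if predicted_positives == 0 or real_positives == 0:
        return 0.0

    # Precision: share of predicted positives that are correct.
    precision = true_positives / predicted_positives
    # Recall: share of real positives that were discovered.
    recall = true_positives / real_positives
    if precision + recall == 0:
        return 0.0

    return 2 * precision * recall / (precision + recall)
```

In a 5-fold cross validation, one common choice is to compute this score on each held-out fold and average the five values; whether the cited work averages per fold or pools the counts is not stated in the quoted text.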
“…Finally, recent studies have improved the GP performance by integrating multiple data sources, including expression profiles, SNP genotype data, expression quantitative trait loci, and so on [19]: network-based approaches indeed construct a consensus network which includes the specificity of each network, covers more genes and contains more accurate pairwise connections [20].…”
Section: First
Citation type: mentioning
confidence: 99%