2014
DOI: 10.1093/bioinformatics/btu118
Protein fold recognition using geometric kernel data fusion

Abstract: Motivation: Various approaches, based on features extracted from protein sequences and often using machine learning methods, have been applied to the prediction of protein folds. Finding an efficient technique for integrating these different protein features has received increasing attention. In particular, kernel methods are an interesting class of techniques for integrating heterogeneous data. Various methods have been proposed to fuse multiple kernels. Most techniques for multiple kernel learning focus on learning a c…
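Kernel fusion of the kind the abstract describes can be illustrated with a log-Euclidean geometric mean of precomputed kernel matrices. This is a minimal sketch, not the paper's exact geodesic formulation; the helper names and the eigenvalue floor `eps` are assumptions:

```python
import numpy as np

def spd_log(K, eps=1e-10):
    # Matrix logarithm of a symmetric PSD kernel; eigenvalues floored at eps
    w, V = np.linalg.eigh(K)
    return V @ np.diag(np.log(np.clip(w, eps, None))) @ V.T

def spd_exp(S):
    # Matrix exponential of a symmetric matrix
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.exp(w)) @ V.T

def log_euclidean_mean(kernels):
    # Fuse kernel matrices by averaging their matrix logarithms,
    # then mapping back with the matrix exponential
    return spd_exp(sum(spd_log(K) for K in kernels) / len(kernels))

# Toy example: fuse two linear kernels built from different feature sets
rng = np.random.default_rng(0)
X1, X2 = rng.normal(size=(5, 3)), rng.normal(size=(5, 4))
K_fused = log_euclidean_mean([X1 @ X1.T, X2 @ X2.T])
```

The fused matrix stays symmetric and positive definite, so it can be passed directly to any kernel classifier that accepts a precomputed kernel.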

Cited by 27 publications (11 citation statements); references 45 publications.
“…Each fold has at least 11 training examples. The accuracy of our classifiers is shown in Table 1 along with the results reported by several other published methods [12, 15, 16, 17, 18, 19]. Since some of the sequences in these benchmarks are similar to the templates that make up our feature space (that is, similar to the sequences of the domains from which the templates are derived), we ran our classifiers both with and without filtering of these template-similar sequences (Table 1; “filtered” versions correspond to benchmarks where sequences with >25% pairwise identity with any template were removed; see Methods).…”
Section: Results
confidence: 98%
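The >25% identity filter described in the excerpt above can be sketched as follows. This assumes query and template sequences are already pairwise aligned (real pipelines derive identity from an alignment tool); `percent_identity` and `filter_benchmark` are illustrative names, not the authors' code:

```python
def percent_identity(a, b):
    # Percent identity over aligned (equal-length) sequences, gaps excluded
    matches = sum(x == y and x != '-' for x, y in zip(a, b))
    aligned = sum(x != '-' and y != '-' for x, y in zip(a, b))
    return 100.0 * matches / aligned if aligned else 0.0

def filter_benchmark(queries, templates, cutoff=25.0):
    # Drop any query sharing more than cutoff% identity with some template
    return [q for q in queries
            if all(percent_identity(q, t) <= cutoff for t in templates)]

# Toy example: the first query is too similar to the template and is removed
kept = filter_benchmark(["ACDEFG", "MNPQRS"], ["ACDEFH"])
```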
“…From our previous studies, the kernel in kernel methods mainly acts as an interface between the input data module and the learning algorithms [13]. Selecting the best kernel, one rich in classification information, is the key step in applying kernel methods [45]. However, it is very difficult to select the best kernel in practice, because the selection largely depends on the encoding of our prior knowledge about the data and the types of patterns we expect to identify.…”
Section: Discussion
confidence: 99%
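One common heuristic for the kernel-selection problem raised in this excerpt is kernel-target alignment, which scores how well a candidate kernel matrix matches the ideal label kernel yyᵀ. A minimal sketch; this is a standard heuristic, not necessarily the criterion used in [45]:

```python
import numpy as np

def kernel_target_alignment(K, y):
    # Frobenius cosine similarity between K and the ideal label kernel yy^T
    Y = np.outer(y, y)
    return (K * Y).sum() / (np.linalg.norm(K) * np.linalg.norm(Y))

# A kernel matching the label structure scores higher than a neutral one
y = np.array([1, 1, -1, -1])
K_ideal = np.outer(y, y).astype(float)  # perfectly label-aligned kernel
K_plain = np.eye(4)                     # uninformative identity kernel
```

Here `kernel_target_alignment(K_ideal, y)` evaluates to 1.0 while the identity kernel scores 0.5, so ranking candidate kernels by alignment picks the informative one.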
“…Hess et al (2009), for example, used an ensemble method combining three different Bayesian approaches that integrated sequence features, coexpression, protein-protein interactions, binding sites, and subcellular localization to improve protein function prediction. Other studies have combined tertiary structure with primary structural homology and secondary structure biochemistry (Wang et al, 2014); sequence data with gene expression, protein-protein interactions, and evolutionary conservation (Cozzetto et al, 2013); sequence with interaction profiles and domain co-occurrence (Wang et al, 2013); and sequence features with active-site motifs and structural alignment for determination of protein fold activities (Zakeri et al, 2014). Purely technical issues have impeded the application of these methods to microbial genomes and metagenomes.…”
Section: Computational Approaches To Hypothesize Function For Human
confidence: 99%