2014
DOI: 10.1093/bioinformatics/btu118
Protein fold recognition using geometric kernel data fusion

Abstract: Motivation: Various approaches, based on features extracted from protein sequences and often using machine learning methods, have been applied to the prediction of protein folds. Finding an efficient technique for integrating these different protein features has received increasing attention. In particular, kernel methods are an interesting class of techniques for integrating heterogeneous data. Various methods have been proposed to fuse multiple kernels. Most techniques for multiple kernel learning focus on learning a c…
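Kernel fusion of the kind the abstract describes can be illustrated with a log-Euclidean geometric mean of precomputed kernel matrices. This is a minimal sketch, not the paper's exact geodesic formulation; the helper names and the eigenvalue floor `eps` are assumptions:

```python
import numpy as np

def spd_log(K, eps=1e-10):
    # Matrix logarithm of a symmetric PSD kernel; eigenvalues floored at eps
    w, V = np.linalg.eigh(K)
    return V @ np.diag(np.log(np.clip(w, eps, None))) @ V.T

def spd_exp(S):
    # Matrix exponential of a symmetric matrix
    w, V = np.linalg.eigh(S)
    return V @ np.diag(np.exp(w)) @ V.T

def log_euclidean_mean(kernels):
    # Fuse kernel matrices by averaging their matrix logarithms,
    # then mapping back with the matrix exponential
    return spd_exp(sum(spd_log(K) for K in kernels) / len(kernels))

# Toy example: fuse two linear kernels built from different feature sets
rng = np.random.default_rng(0)
X1, X2 = rng.normal(size=(5, 3)), rng.normal(size=(5, 4))
K_fused = log_euclidean_mean([X1 @ X1.T, X2 @ X2.T])
```

The fused matrix stays symmetric and positive definite, so it can be passed directly to any kernel classifier that accepts a precomputed kernel.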

Cited by 27 publications (11 citation statements); references 45 publications.
“…Each fold has at least 11 training examples. The accuracy of our classifiers is shown in Table 1 along with the results reported by several other published methods [12, 15, 16, 17, 18, 19]. Since some of the sequences in these benchmarks are similar to the templates that make up our feature space (that is, similar to the sequences of the domains from which the templates are derived), we ran our classifiers both with and without filtering of these template-similar sequences (Table 1; “filtered” versions correspond to benchmarks where sequences with >25% pairwise identity with any template were removed; see Methods).…”
Section: Results
confidence: 98%
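The >25% identity filter described in the excerpt above can be sketched as follows. This assumes query and template sequences are already pairwise aligned (real pipelines derive identity from an alignment tool); `percent_identity` and `filter_benchmark` are illustrative names, not the authors' code:

```python
def percent_identity(a, b):
    # Percent identity over aligned (equal-length) sequences, gaps excluded
    matches = sum(x == y and x != '-' for x, y in zip(a, b))
    aligned = sum(x != '-' and y != '-' for x, y in zip(a, b))
    return 100.0 * matches / aligned if aligned else 0.0

def filter_benchmark(queries, templates, cutoff=25.0):
    # Drop any query sharing more than cutoff% identity with some template
    return [q for q in queries
            if all(percent_identity(q, t) <= cutoff for t in templates)]

# Toy example: the first query is too similar to the template and is removed
kept = filter_benchmark(["ACDEFG", "MNPQRS"], ["ACDEFH"])
```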
“…From our previous studies, the kernel in kernel methods mainly acts as an interface between the input data module and the learning algorithms [13]. Selecting the best kernel, one rich in classification information, is the key step in applying kernel methods [45]. However, it is very difficult to select the best kernel in practice, because the selection largely depends on the encoding of our prior knowledge about the data and the types of patterns we expect to identify.…”
Section: Discussion
confidence: 99%
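One common heuristic for the kernel-selection problem raised in this excerpt is kernel-target alignment, which scores how well a candidate kernel matrix matches the ideal label kernel yyᵀ. A minimal sketch; this is a standard heuristic, not necessarily the criterion used in [45]:

```python
import numpy as np

def kernel_target_alignment(K, y):
    # Frobenius cosine similarity between K and the ideal label kernel yy^T
    Y = np.outer(y, y)
    return (K * Y).sum() / (np.linalg.norm(K) * np.linalg.norm(Y))

# A kernel matching the label structure scores higher than a neutral one
y = np.array([1, 1, -1, -1])
K_ideal = np.outer(y, y).astype(float)  # perfectly label-aligned kernel
K_plain = np.eye(4)                     # uninformative identity kernel
```

Here `kernel_target_alignment(K_ideal, y)` evaluates to 1.0 while the identity kernel scores 0.5, so ranking candidate kernels by alignment picks the informative one.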
“…Hess et al (2009), for example, used an ensemble method combining three different Bayesian approaches that integrated sequence features, coexpression, protein-protein interactions, binding sites, and subcellular localization to improve protein function prediction. Other studies have combined tertiary structure with primary structural homology and secondary structure biochemistry (Wang et al, 2014); sequence data with gene expression, protein-protein interactions, and evolutionary conservation (Cozzetto et al, 2013); sequence with interaction profiles and domain co-occurrence (Wang et al, 2013); and sequence features with active-site motifs and structural alignment for determination of protein fold activities (Zakeri et al, 2014). Purely technical issues have impeded the application of these methods to microbial genomes and metagenomes.…”
Section: Computational Approaches To Hypothesize Function For Human
confidence: 99%