An Integrated Machine Learning System to Computationally Screen Protein Databases for Protein Binding Peptide Ligands

Zhang, Ling; Shao, Chen; Zheng, Dexian; Gao, Youhe

doi:10.1074/mcp.m500346-mcp200

Cited by 17 publications

(16 citation statements)

References 40 publications


“…We investigated the performance of SVM using a balanced training data set. Following the procedure of Zhang et al, 13 we retrained an SVM classifier using a balanced data set of 41 binders and 41 randomly selected non-binders. The prediction accuracies for binders and non-binders are 90.2% and 85.4%, respectively.…”

Section: Classification Models Using SVM (mentioning)

confidence: 99%
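The balanced-training evaluation described in the statement above can be sketched as follows. This is a minimal illustration, not the cited authors' code: the helper names `balanced_sample` and `per_class_accuracy` are hypothetical, the SVM itself is left out, and the counts of 37 and 35 correct predictions are an assumption chosen only because they reproduce the quoted 90.2% and 85.4% on 41 examples per class.

```python
import random


def balanced_sample(binders, non_binders, seed=0):
    """Undersample the larger class so both classes have equal size."""
    rng = random.Random(seed)
    n = min(len(binders), len(non_binders))
    return rng.sample(binders, n), rng.sample(non_binders, n)


def per_class_accuracy(y_true, y_pred):
    """Return {label: fraction of that label's examples predicted correctly}."""
    totals, correct = {}, {}
    for truth, pred in zip(y_true, y_pred):
        totals[truth] = totals.get(truth, 0) + 1
        correct[truth] = correct.get(truth, 0) + (truth == pred)
    return {label: correct[label] / totals[label] for label in totals}


# Hypothetical labels: 1 = binder, 0 = non-binder, 41 examples of each.
y_true = [1] * 41 + [0] * 41
# Assumed outcome: 37 binders and 35 non-binders classified correctly.
y_pred = [1] * 37 + [0] * 4 + [0] * 35 + [1] * 6

accuracy = per_class_accuracy(y_true, y_pred)
print({label: round(100 * value, 1) for label, value in accuracy.items()})
# prints {1: 90.2, 0: 85.4}
```

Reporting accuracy separately for binders and non-binders, as the statement does, keeps a classifier that favors one class from hiding behind a single pooled figure.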

“…Recently, machine learning algorithms, such as artificial neural network and support vector machine (SVM), were introduced to predict the SH3 domain binding peptides based on contact information. 12,13 Training these classifiers usually requires data for numerous SH3 domains because the number of possible combinations of contacts is huge. On the other hand, these methods are computationally efficient and can be used for quick proteome screening.…”

Section: Introduction (mentioning)

confidence: 99%

“…[10][11][12][13][14][15][16][17][18][19][20] For example, the SH3-SPOT method builds a position-specific contact frequency matrix based on protein-peptide contacts in a number of crystal structures of SH3/peptide and SH3/protein complexes. 11 The matrix was then used to calculate the probability of a peptide binding to a specific SH3 domain.…”

Section: Introduction (mentioning)

confidence: 99%
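The idea of a position-specific frequency matrix can be illustrated in a much simpler setting than the one the statement above describes. The sketch below is not SH3-SPOT: it counts residues at each position of aligned peptides rather than protein-peptide contacts from crystal structures. The function names, the pseudocount, the uniform background, and the toy peptides are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_frequency_matrix(peptides, pseudocount=1.0):
    """Per-position residue frequencies from equal-length aligned peptides."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    matrix = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        matrix.append({aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS})
    return matrix


def log_odds_score(peptide, matrix, background=1.0 / len(AMINO_ACIDS)):
    """Sum of log(frequency / background) over positions; higher is more motif-like."""
    if len(peptide) != len(matrix):
        raise ValueError("peptide length must match the matrix")
    return sum(math.log(matrix[i][aa] / background) for i, aa in enumerate(peptide))


# Toy aligned peptides, invented for this example (not experimental data).
known_binders = ["RPLPPLP", "RALPPLP", "KPLPSLP"]
matrix = build_frequency_matrix(known_binders)
print(log_odds_score("RPLPPLP", matrix) > log_odds_score("GGGGGGG", matrix))
# prints True
```

The pseudocount keeps unseen residues from producing a zero frequency, which would otherwise make the logarithm undefined.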


Characterization of Domain–Peptide Interaction Interface: A Case Study on the Amphiphysin-1 SH3 Domain. Hou, Zhang, Case, et al. 2008. Journal of Molecular Biology.


“…Comparison with SH3-hunter: Among all the methods for predicting the binding specificity of SH3 domains (17)(18)(19), iSPOT and its improved version, SH3-hunter, are publicly available (17,18). Sparks et al. (10) studied interactions between 20 peptides and 13 SH3 domains, among which Src, Yes, Abl, and Grb2 were modeled in our study.…”

Section: Comparison With Other Methods (mentioning)

confidence: 99%

Characterization of Domain-Peptide Interaction Interface. Hou, Zhang, et al. 2009. Molecular & Cellular Proteomics.

Extensive efforts have been devoted to determining the binding specificity of Src homology 3 (SH3) domains, usually in a case-by-case manner. A generic structure-based model is necessary to decipher the protein recognition code of the entire domain family. In this study, we have developed a general framework that combines molecular modeling and a machine learning algorithm to capture the energetic characteristics of the domain-peptide interactions and predict the binding specificity of the SH3 domain family. Our model is not trained for individual SH3 domains; rather, it is a generic model for the entire domain family. Our model not only achieved satisfactory prediction accuracy but also provided structural insights into which residues are important for the binding specificity.

... domain (4) that recognizes proline-rich peptides with a core motif of PXXP (P is a proline and X is any amino acid) (5, 6). Peptides can bind to SH3 domains in two opposite orientations and are referred to as class I and class II peptides, which often contain +XXPXXP and PXXPX+ motifs, respectively (where X refers to any residue and + refers to a positively charged residue). The binding specificity of an SH3 domain is determined by the amino acids in the flanking regions of the core motif, which has been investigated extensively for individual domains. However, a universal model was lacking to decipher the protein recognition code of the SH3 domain family.

A generic model for the entire domain family needs to 1) provide a general framework to characterize the domain-peptide interaction and 2) reliably predict the binding specificity of each member in the domain family. Previous experimental and computational studies can only satisfy one of these requirements. For example, peptide library and peptide or protein array technologies are commonly used to determine the peptide motifs recognized by a domain, often represented as a position-specific scoring matrix (7-13). These approaches have limited coverage of the peptide space because the peptides tested in the experiments usually represent only a small portion of all the possible peptides of a given length. In addition, the predictive power of a sequence motif for the interacting partners of a domain is often unsatisfactory. Along that line, a survey of protein-protein interaction interfaces (14) also suggested that a sophisticated model, rather than a set of well-defined rules, is needed to decipher the specificity of protein recognition.

On the other hand, high-throughput technologies, such as yeast two-hybrid assays and complex purification followed by mass spectrometry, have been used to identify protein-protein interactions. However, these methods often miss the weak and transient domain-peptide interactions (15). Various computational methods have also been developed to predict the interacting partners of modular domains (16-20). For example, the SH3-SPOT method builds a position-specific contact frequency matrix based on the protein-peptide contacts in a number of crystal structures of SH3-peptide and ...
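The class I and class II motif definitions given in the abstract above can be written as simple sequence patterns. The sketch below is only an illustration of those two motifs, not the structure-based model the abstract describes. It assumes that the positively charged residue is lysine (K) or arginine (R); the function name `find_sh3_motifs` and the example sequences are hypothetical.

```python
import re

# Assumption: the positively charged residue is taken to be K or R.
POSITIVE = "KR"

# Lookaheads let overlapping motif occurrences be reported.
CLASS_I = re.compile(rf"(?=([{POSITIVE}]..P..P))")   # +XXPXXP
CLASS_II = re.compile(rf"(?=(P..P.[{POSITIVE}]))")   # PXXPX+


def find_sh3_motifs(sequence):
    """Return (class label, start index, matched segment) for each motif hit."""
    hits = []
    for label, pattern in (("I", CLASS_I), ("II", CLASS_II)):
        for match in pattern.finditer(sequence):
            hits.append((label, match.start(), match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Invented sequences used only to exercise the patterns.
print(find_sh3_motifs("AARPLPPLPAA"))  # prints [('I', 2, 'RPLPPLP')]
print(find_sh3_motifs("PPLPPRAA"))     # prints [('II', 0, 'PPLPPR')]
```

As the abstract notes, a sequence motif alone often predicts interacting partners poorly, so a match here is at most a candidate for further analysis.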


“…Several computational methods related to domain-peptide interaction are available, but only on limited domains, such as SH2 (44), SH3 (45)(46)(47)(48)(49), and PDZ (50 -55). For all of these domains, enough interaction data have already been obtained to train a reasonably good machine learning model.…”

mentioning

confidence: 99%

Identification of Methyllysine Peptides Binding to Chromobox Protein Homolog 6 Chromodomain in the Human Proteome. Stein, Wang, et al. 2013. Molecular & Cellular Proteomics.

Methylation is one of the important post-translational modifications that play critical roles in regulating protein functions. Proteomic identification of this post-translational modification and understanding how it affects protein activity remain great challenges. We tackled this problem from the aspect of methylation mediating protein-protein interaction. Using the chromodomain of human chromobox protein homolog 6 as a model system, we developed a systematic approach that integrates structure modeling, bioinformatics analysis, and peptide microarray experiments to identify lysine residues that are methylated and recognized by the chromodomain in the human proteome. Given the important role of chromobox protein homolog 6 as a reader of histone modifications, it was interesting to find that the majority of its interacting partners identified via this approach function in chromatin remodeling and transcriptional regulation. Our study not only illustrates a novel angle for identifying methyllysines on a proteome-wide scale and elucidating their potential roles in regulating protein function, but also suggests possible strategies for engineering the chromodomain-peptide interface to enhance the recognition of and manipulate the signal transduction mediated by such interactions.
