Summary
Members of the SH2 domain family modulate signal transduction by binding to short peptides containing phosphorylated tyrosines. Each domain displays a distinct preference for the sequence context of the phosphorylated residue. We have developed a new high-density peptide chip technology that allows probing the affinity of most SH2 domains for a large fraction of the entire complement of tyrosine phosphopeptides in the human proteome. Using this technique, we have experimentally identified thousands of putative SH2-peptide interactions for more than 70 different SH2 domains. By integrating this rich data set with orthogonal context-specific information, we have assembled an SH2-mediated probabilistic interaction network, which we make available as a community resource in the PepSpotDB database. A newly predicted dynamic interaction between the SH2 domains of the tyrosine phosphatase SHP2 and the phosphorylated tyrosine in the ERK activation loop was validated by experiments in living cells.
There is growing evidence that tyrosine phosphatases display an intrinsic enzymatic preference for the sequence context flanking the target phosphotyrosines. On the other hand, substrate selection in vivo is decisively guided by the enzyme-substrate connectivity in the protein interaction network. We describe here a system-wide strategy to infer physiological substrates of protein-tyrosine phosphatases. Using a Bayesian model, we integrate proteome-wide evidence about in vitro substrate preference, as determined by a novel high-density peptide chip technology, with “closeness” in the protein interaction network. This allows us to rank candidate substrates of the human PTP1B phosphatase. A variety of in vitro and in vivo approaches were then used to verify the prediction that the tyrosine phosphorylation levels of five high-ranking substrates, PLC-γ1, Gab1, SHP2, EGFR, and SHP1, are indeed specifically modulated by PTP1B. In addition, we demonstrate that the PTP1B-mediated dephosphorylation of Gab1 negatively affects its EGF-induced association with the phosphatase SHP2. The dissociation of this signaling complex is accompanied by a decrease of ERK MAP kinase phosphorylation and activation.
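The abstract does not spell out the Bayesian model, so the following is only a minimal sketch of one common way to combine two independent lines of evidence: a naive Bayes update in which each source contributes a likelihood ratio. The function names, the prior, and the placeholder candidates and likelihood ratios are all assumptions made for illustration; none of them come from the study.

```python
def posterior_probability(prior, likelihood_ratios):
    """Combine independent evidence sources with a naive Bayes update.

    prior: assumed prior probability that a candidate is a true substrate.
    likelihood_ratios: one ratio per evidence source, P(evidence | substrate)
        divided by P(evidence | non-substrate). Values are hypothetical.
    """
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds
    for ratio in likelihood_ratios:
        # Independence assumption: the ratios simply multiply.
        posterior_odds *= ratio
    return posterior_odds / (1.0 + posterior_odds)


def rank_candidates(candidates, prior):
    """Rank (name, chip_ratio, network_ratio) tuples by posterior, highest first."""
    scored = [
        (name, posterior_probability(prior, [chip_ratio, network_ratio]))
        for name, chip_ratio, network_ratio in candidates
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Placeholder candidates with made-up likelihood ratios for the
    # peptide-chip evidence and the network-closeness evidence.
    hypothetical = [
        ("candidate_A", 4.0, 3.0),
        ("candidate_B", 1.5, 0.5),
        ("candidate_C", 0.8, 6.0),
    ]
    for name, probability in rank_candidates(hypothetical, prior=0.05):
        print(f"{name}: {probability:.3f}")
```

Multiplying likelihood ratios treats the in vitro preference and the network closeness as conditionally independent, which is a simplifying assumption; a model that accounts for dependence between the two sources would need a joint likelihood instead.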
How is the yeast proteome wired? This important question, central in yeast systems biology, remains unanswered in spite of the abundance of protein interaction data from high-throughput experiments. Unfortunately, these large-scale studies show striking discrepancies in their results and coverage, such that biologists scrutinizing the "interactome" are often confounded by a mix of established physical interactions, functional associations, and experimental artifacts. This stimulated early attempts to integrate the available information and produce a list of protein interactions ranked according to an estimated functional reliability. The recent publication of the results of two large protein interaction experiments and the completion of a comprehensive literature curation effort have more than doubled the available information on the wiring of the yeast proteome. This motivates a fresh approach to the compilation of a yeast interactome based purely on evidence of physical interaction. We present a procedure exploiting both heuristic and probabilistic strategies to draft the yeast interactome, taking advantage of various heterogeneous data sources: tandem affinity purification coupled to MS (TAP-MS), large-scale yeast two-hybrid studies, and results of small-scale experiments stored in dedicated databases. The end result is WI-PHI, a weighted network encompassing a large majority of yeast proteins.
Background: In the absence of consolidated pipelines to archive biological data electronically, information dispersed in the literature must be captured by manual annotation. Unfortunately, manual annotation is time-consuming, and the coverage of published interaction data is therefore far from complete. The use of text-mining tools to identify relevant publications and to assist in the initial information extraction could help to improve the efficiency of the curation process and, as a consequence, the database coverage of data available in the literature. The 2006 BioCreative competition was aimed at evaluating text-mining procedures in comparison with manual annotation of protein-protein interactions.