Proteomes are characterized by large protein-abundance differences, cell-type- and time-dependent expression patterns and post-translational modifications, all of which carry biological information that is not accessible by genomics or transcriptomics. Here we present a mass-spectrometry-based draft of the human proteome and a public, high-performance, in-memory database for real-time analysis of terabytes of big data, called ProteomicsDB. The information assembled from human tissues, cell lines and body fluids enabled estimation of the size of the protein-coding genome, and identified organ-specific proteins and a large number of translated lincRNAs (long intergenic non-coding RNAs). Analysis of messenger RNA and protein-expression profiles of human tissues revealed conserved control of protein abundance, and integration of drug-sensitivity data enabled the identification of proteins predicting resistance or sensitivity. The proteome profiles also hold considerable promise for analysing the composition and stoichiometry of protein complexes. ProteomicsDB thus enables navigation of proteomes, provides biological insight and fosters the development of proteomic technology.
We describe a chemical proteomics approach to profile the interaction of small molecules with hundreds of endogenously expressed protein kinases and purine-binding proteins. This subproteome is captured by immobilized nonselective kinase inhibitors (kinobeads), and the bound proteins are quantified in parallel by mass spectrometry using isobaric tags for relative and absolute quantification (iTRAQ). By measuring the competition with the affinity matrix, we assess the binding of drugs to their targets in cell lysates and in cells. By mapping drug-induced changes in the phosphorylation state of the captured proteome, we also analyze signaling pathways downstream of target kinases. Quantitative profiling of the drugs imatinib (Gleevec), dasatinib (Sprycel) and bosutinib in K562 cells confirms known targets including ABL and SRC family kinases and identifies the receptor tyrosine kinase DDR1 and the oxidoreductase NQO2 as novel targets of imatinib. The data suggest that our approach is a valuable tool for drug discovery.
The development of selective histone deacetylase (HDAC) inhibitors with anti-cancer and anti-inflammatory properties remains challenging in large part owing to the difficulty of probing the interaction of small molecules with megadalton protein complexes. A combination of affinity capture and quantitative mass spectrometry revealed the selectivity with which 16 HDAC inhibitors target multiple HDAC complexes scaffolded by ELM-SANT domain subunits, including a novel mitotic deacetylase complex (MiDAC). Inhibitors clustered according to their target profiles with stronger binding of aminobenzamides to the HDAC NCoR complex than to the HDAC Sin3 complex. We identified several non-HDAC targets for hydroxamate inhibitors. HDAC inhibitors with distinct profiles have correspondingly different effects on downstream targets. We also identified the anti-inflammatory drug bufexamac as a class IIb (HDAC6, HDAC10) HDAC inhibitor. Our approach enables the discovery of novel targets and inhibitors and suggests that the selectivity of HDAC inhibitors should be evaluated in the context of HDAC complexes and not purified catalytic subunits.
The direct detection of drug-protein interactions in living cells is a major challenge in drug discovery research. Recently, we introduced an approach termed thermal proteome profiling (TPP), which enables the monitoring of changes in protein thermal stability across the proteome using quantitative mass spectrometry. We determined the intracellular thermal profiles for up to 7,000 proteins, and by comparing profiles derived from cultured mammalian cells in the presence or absence of a drug we showed that it was possible to identify direct and indirect targets of drugs in living cells in an unbiased manner. Here we demonstrate the complete workflow using the histone deacetylase inhibitor panobinostat. The key to this approach is the use of isobaric tandem mass tag 10-plex (TMT10) reagents to label digested protein samples corresponding to each temperature point in the melting curve so that the samples can be analyzed by multiplexed quantitative mass spectrometry. Important steps in the bioinformatic analysis include data normalization, melting curve fitting and statistical significance determination of compound concentration-dependent changes in protein stability. All analysis tools are made freely available as R and Python packages. The workflow can be completed in 2 weeks.
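The melting-curve fitting step mentioned above can be illustrated with a minimal sketch. It assumes a simplified two-parameter logistic model and a coarse grid search in pure Python; it is not the authors' R or Python package, and the function names, temperature points and parameter ranges are hypothetical choices made for illustration only.

```python
import math

def sigmoid(temp, tm, slope):
    """Soluble fraction at `temp` for a simplified two-parameter logistic curve (assumed model)."""
    return 1.0 / (1.0 + math.exp((temp - tm) / slope))

def fit_melting_curve(temps, fractions):
    """Coarse grid-search least-squares fit; returns (tm, slope).

    The grid (Tm 30.0-80.0 in 0.1 steps, slope 0.5-10.0 in 0.1 steps) is an
    illustrative assumption, not a value taken from the protocol.
    """
    best = None
    for tm10 in range(300, 801):
        tm = tm10 / 10.0
        for s10 in range(5, 101):
            slope = s10 / 10.0
            sse = sum((sigmoid(t, tm, slope) - f) ** 2
                      for t, f in zip(temps, fractions))
            if best is None or sse < best[0]:
                best = (sse, tm, slope)
    return best[1], best[2]

# Ten hypothetical temperature points, one per TMT10 channel.
temps = [37, 41, 44, 47, 50, 53, 56, 59, 63, 67]

# Synthetic soluble fractions for a vehicle-treated and a drug-treated sample.
vehicle = [sigmoid(t, 50.0, 2.0) for t in temps]
treated = [sigmoid(t, 53.0, 2.0) for t in temps]

tm_vehicle, _ = fit_melting_curve(temps, vehicle)
tm_treated, _ = fit_melting_curve(temps, treated)
delta_tm = tm_treated - tm_vehicle  # positive shift suggests stabilization
print(f"Tm vehicle={tm_vehicle:.1f}, Tm treated={tm_treated:.1f}, shift={delta_tm:.1f}")
```

In this toy setting a positive shift in the fitted melting point between treated and vehicle samples would flag a candidate target; a real analysis would also need the normalization and significance testing that the abstract lists.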
Plants are indispensable for life on earth and represent organisms of extreme biological diversity with unique molecular capabilities 1. Here, we present a quantitative atlas of the transcriptomes, proteomes and phosphoproteomes of 30 tissues of the model plant Arabidopsis thaliana. It provides initial answers to how many genes exist as proteins (>18,000), where they are expressed, in which approximate quantities (>6 orders of magnitude dynamic range) and to what extent they are phosphorylated (>43,000 sites). We present examples for how the data may be used, for instance, to discover proteins translated from short open reading frames, to uncover sequence motifs involved in protein expression regulation, to identify tissue-specific protein complexes or phosphorylation-mediated signaling events to name a few. Interactive access to this unique resource for the plant community is provided via ProteomicsDB and ATHENA which include powerful bioinformatics tools to explore and characterize Arabidopsis proteins, their modifications and interplay.
Main
The plant model organism Arabidopsis thaliana (AT) has revolutionized our understanding of plant biology and influenced many other areas of the life sciences 1. Knowledge derived from Arabidopsis has also provided mechanistic understanding of important agronomic traits in crop species 2. The Arabidopsis genome was sequenced 20 years ago and hundreds of natural variants have since been analyzed at the genome and epigenome level 3,4. In contrast, the Arabidopsis proteome as the main executer of most biological processes is far less comprehensively characterized. To address this gap, we used state-of-the-art mass spectrometry and RNA sequencing (RNA-seq) to provide the first integrated proteomic, phosphoproteomic and transcriptomic atlas of Arabidopsis.
Illustrated by selected examples, we show how this rich molecular resource can be used to explore the function of single proteins or entire pathways across multiple omics levels.
Multi-omics atlas of Arabidopsis
We generated an expression atlas covering, on average, 17,603 ± 1,317 transcripts, 14,430 ± 911 proteins and 14,689 ± 2,509 phosphorylation sites (p-sites) per tissue, using a reproducible biochemical and analytical approach (Fig. 1a,b; Extended Data Fig. 1a-c; Supplementary Data 1,2). In total, the protein expression data covers 18,210 of the 27,655 protein-coding genes (66%) annotated in Araport11 5. This is a substantial increase compared to the percentage of genes with protein level evidence reported in UniProt (27%) 6 and more than double the number of proteins identified in an earlier tissue proteome analysis 7 (Fig. 1c, Extended Data Fig. 1d-f). In addition, we report tissue-resolved quantitative evidence for a total of 43,903 p-sites making this study the most comprehensive single Arabidopsis phosphoproteome published to date (Fig. 1c). 47% of the expressed proteome was found to be phosphorylated in at least one instance, confirming earlier analyses of individual
Large-scale phosphorylation analysis is increasingly a focus of proteomic research. Although it is now possible to identify thousands of phosphorylated peptides in a biological system, confident site localization remains challenging. Here we validate the Mascot Delta Score (MD-score) as a simple method that achieves similar sensitivity and specificity for phosphosite localization as the published Ascore, which is mainly used in conjunction with Sequest. The MD-score was evaluated using liquid chromatography-tandem MS data of 180 individually synthesized phosphopeptides with precisely known phosphorylation sites. We tested the MD-score for a wide range of commonly available fragmentation methods and found it to be applicable throughout with high statistical significance. However, the different fragmentation techniques differ strongly in their ability to localize phosphorylation sites. At 1% false localization rate, the highest number of correctly assigned phosphopeptides was achieved by higher energy collision induced dissociation in combination with an Orbitrap mass analyzer, followed very closely by low resolution ion trap spectra obtained after electron transfer dissociation. Both these methods are significantly better than low resolution spectra acquired after collision induced dissociation and multi stage activation. Score thresholds determined from simple calibration functions for each fragmentation method were stable over replicate analyses of the phosphopeptide set. The MD-score outperforms the Ascore for tyrosine phosphorylated peptides, and we further show that the ability to call sites correctly increases with increasing distance between two candidate sites within a peptide sequence. The MD-score does not require complex computational steps, which makes it attractive in terms of practical utility.
We provide all mass spectra and the synthetic peptides to the community so that the development of present and future localization software can be benchmarked and any laboratory can determine MD-scores and localization probabilities for their individual analytical set up.
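As a rough illustration of why a delta score needs no complex computation, the sketch below assumes the MD-score is the difference in search-engine ion score between the best and second-best site assignment of the same peptide sequence. That definition, the function names, the example scores and the threshold value are assumptions for illustration; the abstract does not spell them out, and real thresholds would come from the per-fragmentation-method calibration it describes.

```python
def mascot_delta_score(candidates):
    """Return (best_site, delta) for one spectrum.

    `candidates` is a list of (site_label, ion_score) pairs, one per
    alternative phosphosite assignment of the same peptide sequence.
    At least two candidates are required so that a difference exists.
    """
    if len(candidates) < 2:
        raise ValueError("need at least two candidate site assignments")
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    best_site, best_score = ranked[0]
    second_score = ranked[1][1]
    return best_site, best_score - second_score

def is_confidently_localized(delta, threshold):
    """Compare a delta score against a method-specific threshold (hypothetical value)."""
    return delta >= threshold

# Hypothetical ion scores for three candidate sites on one peptide.
candidates = [("T5", 28.7), ("S3", 41.2), ("Y9", 12.0)]
site, delta = mascot_delta_score(candidates)
print(site, round(delta, 1), is_confidently_localized(delta, threshold=10.0))
```

The threshold of 10.0 is a placeholder only; it is not a value reported in the text.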
A better understanding of proteostasis in health and disease requires robust methods to determine protein half-lives. Here we improve the precision and accuracy of peptide ion intensity-based quantification, enabling more accurate protein turnover determination in non-dividing cells by dynamic SILAC-based proteomics. This approach allows exact determination of protein half-lives ranging from 10 to >1000 h. We identified 4000–6000 proteins in several non-dividing cell types, corresponding to 9699 unique protein identifications over the entire data set. We observed similar protein half-lives in B-cells, natural killer cells and monocytes, whereas hepatocytes and mouse embryonic neurons show substantial differences. Our data set extends and statistically validates the previous observation that subunits of protein complexes tend to have coherent turnover. Moreover, analysis of different proteasome and nuclear pore complex assemblies suggests that their turnover rate is architecture dependent. These results illustrate that our approach allows investigating protein turnover and its implications in various cell types.
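To make the link between label incorporation and half-life concrete, here is a minimal sketch assuming simple first-order turnover in non-dividing cells, where the newly synthesized (labeled) fraction follows 1 - exp(-k t) and the half-life is ln 2 / k. This textbook model, the function names and the synthetic time course are assumptions for illustration and are not the quantification method developed in the paper.

```python
import math

def turnover_rate(times_h, new_fractions):
    """Estimate the first-order rate constant k (per hour).

    Fits -ln(1 - fraction) = k * t by least squares through the origin.
    Fractions must lie in [0, 1).
    """
    if any(not 0.0 <= f < 1.0 for f in new_fractions):
        raise ValueError("fractions must be in [0, 1)")
    num = sum(t * -math.log(1.0 - f) for t, f in zip(times_h, new_fractions))
    den = sum(t * t for t in times_h)
    return num / den

def half_life(rate):
    """Half-life in hours for a first-order rate constant."""
    return math.log(2.0) / rate

# Synthetic time course for a protein with a 100 h half-life.
true_k = math.log(2.0) / 100.0
times_h = [6, 12, 24, 48, 96]
new_fractions = [1.0 - math.exp(-true_k * t) for t in times_h]

k = turnover_rate(times_h, new_fractions)
t_half = half_life(k)
print(f"k = {k:.5f} per h, half-life = {t_half:.1f} h")
```

For very long-lived proteins the labeled fraction stays close to zero over the time course, which is one reason quantification precision matters for the upper end of the half-life range.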
Isobaric mass tagging (e.g., TMT and iTRAQ) is a precise and sensitive multiplexed peptide/protein quantification technique in mass spectrometry. However, accurate quantification of complex proteomic samples is impaired by cofragmentation of peptides, leading to systematic underestimation of quantitative ratios. Label-free quantification strategies do not suffer from such an accuracy bias but cannot be multiplexed and are less precise. Here, we compared protein quantification results obtained with these methods for a chemoproteomic competition binding experiment and evaluated the utility of measures of spectrum purity in survey spectra for estimating the impact of cofragmentation on measured TMT ratios. While applying stringent interference filters enabled substantially more accurate TMT quantification, this came at the expense of 30%-60% fewer proteins quantified. We devised an algorithm that corrects experimental TMT ratios on the basis of determined peptide interference levels. The quantification accuracy achieved with this correction was comparable to that obtained with stringent spectrum filters but limited the loss in coverage to <10%. The generic applicability of the fold change correction algorithm was further demonstrated by spiking of chemoproteomics samples into excess amounts of E. coli tryptic digests.
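The idea of correcting a compressed ratio from a measured interference level can be sketched with a toy model. It assumes that cofragmented background adds the same signal to both reporter channels, so that the observed ratio equals purity * true_ratio + (1 - purity), where purity is the fraction of the reference-channel signal that comes from the target peptide. This model and the function name are assumptions for illustration and should not be read as the published correction algorithm.

```python
def correct_ratio(observed_ratio, purity):
    """Undo ratio compression under the equal-background assumption.

    observed = purity * true + (1 - purity)  =>  true = (observed - (1 - purity)) / purity
    Negative results (possible with noisy input) are clipped to zero.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    corrected = (observed_ratio - (1.0 - purity)) / purity
    return max(corrected, 0.0)

# A true ratio of 0.2 measured at 80% purity appears compressed toward 1.
true_ratio = 0.2
purity = 0.8
observed = purity * true_ratio + (1.0 - purity)
corrected = correct_ratio(observed, purity)
print(f"observed={observed:.2f}, corrected={corrected:.2f}")
```

Unlike a hard purity filter, a correction of this kind keeps low-purity spectra in the data set, which is consistent with the smaller loss in coverage reported above.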