Proteomes are characterized by large protein-abundance differences, cell-type- and time-dependent expression patterns and post-translational modifications, all of which carry biological information that is not accessible by genomics or transcriptomics. Here we present a mass-spectrometry-based draft of the human proteome and a public, high-performance, in-memory database for real-time analysis of terabytes of big data, called ProteomicsDB. The information assembled from human tissues, cell lines and body fluids enabled estimation of the size of the protein-coding genome, and identified organ-specific proteins and a large number of translated lincRNAs (long intergenic non-coding RNAs). Analysis of messenger RNA and protein-expression profiles of human tissues revealed conserved control of protein abundance, and integration of drug-sensitivity data enabled the identification of proteins predicting resistance or sensitivity. The proteome profiles also hold considerable promise for analysing the composition and stoichiometry of protein complexes. ProteomicsDB thus enables navigation of proteomes, provides biological insight and fosters the development of proteomic technology.
Genome‐, transcriptome‐ and proteome‐wide measurements provide insights into how biological systems are regulated. However, fundamental aspects relating to which human proteins exist, where they are expressed and in which quantities are not fully understood. Therefore, we generated a quantitative proteome and transcriptome abundance atlas of 29 paired healthy human tissues from the Human Protein Atlas project representing human genes by 18,072 transcripts and 13,640 proteins including 37 without prior protein‐level evidence. The analysis revealed that hundreds of proteins, particularly in testis, could not be detected even for highly expressed mRNAs, that few proteins show tissue‐specific expression, that strong differences between mRNA and protein quantities within and across tissues exist and that protein expression is often more stable across tissues than that of transcripts. Only 238 of 9,848 amino acid variants found by exome sequencing could be confidently detected at the protein level showing that proteogenomics remains challenging, needs better computational methods and requires rigorous validation. Many uses of this resource can be envisaged including the study of gene/protein expression regulation and biomarker specificity evaluation.
The functional relevance of pre-existing cross-immunity to SARS-CoV-2 is a subject of intense debate. Here, we show that human endemic coronavirus (HCoV)-reactive and SARS-CoV-2-cross-reactive CD4+ T cells are ubiquitous but decrease with age. We identified a universal immunodominant coronavirus-specific spike peptide (S816-830) and demonstrate that pre-existing spike- and S816-830-reactive T cells were recruited into immune responses to SARS-CoV-2 infection and their frequency correlated with anti-SARS-CoV-2-S1-IgG antibodies. Spike-cross-reactive T cells were also activated after primary BNT162b2 COVID-19 mRNA vaccination displaying kinetics similar to secondary immune responses. Our results highlight the functional contribution of pre-existing spike-cross-reactive T cells in SARS-CoV-2 infection and vaccination. Cross-reactive immunity may account for the unexpectedly rapid induction of immunity following primary SARS-CoV-2 immunization and the high rate of asymptomatic/mild COVID-19 disease courses.
Genome-, transcriptome- and proteome-wide measurements provide valuable insights into how biological systems are regulated. However, even fundamental aspects relating to which human proteins exist, where they are expressed and in which quantities are not fully understood. Therefore, we have generated a systematic, quantitative and deep proteome and transcriptome abundance atlas from 29 paired healthy human tissues from the Human Protein Atlas Project and representing human genes by 17,615 transcripts and 13,664 proteins. The analysis revealed that few proteins show truly tissue-specific expression, that vast differences between mRNA and protein quantities within and across tissues exist and that the expression levels of proteins are often more stable across tissues than those of transcripts. In addition, only ~2% of all exome and ~7% of all mRNA variants could be confidently detected at the protein level showing that proteogenomics remains challenging, requires rigorous validation using synthetic peptides and needs more sophisticated computational methods. Many uses of this resource can be envisaged ranging from the study of gene/protein expression regulation to protein biomarker specificity evaluation to name a few.
The ProteomeTools project builds molecular and digital tools from the human proteome to facilitate biomedical and life science research. Here, we report the generation and multimodal LC-MS/MS analysis of >330,000 synthetic tryptic peptides representing essentially all canonical human gene products and exemplify the utility of this data. The resource will be extended to >1 million peptides and all data will be shared with the community via ProteomicsDB and ProteomeXchange.
Citrullination is a post-translational modification of arginine catalyzed by five peptidylarginine deiminases (PADs) in humans. The loss of a positive charge may cause structural or functional alterations and, while the modification has been linked to several diseases including rheumatoid arthritis and cancer, its physiological or pathophysiological roles remain largely unclear. In part this is owing to limitations in available methodology able to robustly enrich, detect and localize the modification. As a result, only a few citrullination sites have been identified on human proteins with high confidence. In this study, we mined data from mass spectrometry-based deep proteomic profiling of 30 human tissues to identify citrullination sites on endogenous proteins. Database searching of ~70 million tandem mass spectra yielded ~13,000 candidate spectra which were further triaged by spectrum quality metrics and the detection of the specific neutral loss of isocyanic acid from citrullinated peptides to reduce false positives. Because citrullination is easily confused with deamidation, we synthesized ~2,200 citrullinated and 1,300 deamidated peptides to build a library of reference spectra. This led to the validation of 375 citrullination sites on 209 human proteins. Further analysis showed that >80% of the identified modification sites were new and for 56% of the proteins, citrullination was detected for the first time. Sequence motif analysis revealed a strong preference for Asp and Gly residues around the citrullination site.
Interestingly, while the modification was detected in 26 human tissues with the highest levels found in brain and lung, citrullination levels did not correlate well with protein expression of the PAD enzymes. Even though the current work represents the largest survey of protein citrullination to date, the modification was mostly detected on highly abundant proteins, arguing that the development of specific enrichment methods would be required in order to study the full extent of cellular protein citrullination.
As the SARS-CoV-2 pandemic continues to spread, thousands of scientists around the globe have changed research direction to understand better how the virus works and to find out how it may be tackled. The number of manuscripts on preprint servers is soaring and peer-reviewed publications using mass spectrometry-based proteomics are beginning to emerge. To facilitate proteomic research on SARS-CoV-2, this report presents deep-scale proteomes (10,000 proteins; >130,000 peptides) of common cell line models, notably Vero E6, Calu-3, Caco-2, and ACE2-A549 that characterize their protein expression profiles including viral entry factors such as ACE2 or TMPRSS2. Using the 9 kDa protein SRP9 and the breast cancer oncogene BRCA1 as examples, we show how the proteome expression data can be used to refine the annotation of protein-coding regions of the African green monkey and the Vero cell line genomes. Monitoring changes of the proteome upon viral infection revealed widespread expression changes including transcriptional regulators, protease inhibitors, and proteins involved in innate immunity. Based on a library of 98 stable-isotope labeled synthetic peptides representing 11 SARS-CoV-2 proteins, we developed PRM (parallel reaction monitoring) assays for nano-flow and micro-flow LC-MS/MS. We assessed the merits of these PRM assays using supernatants of virus-infected Vero E6 cells and challenged the assays by analyzing two diagnostic cohorts of 24 (+30) SARS-CoV-2 positive and 28 (+9) negative cases. In light of the results obtained and including recent publications or manuscripts on preprint servers, we critically discuss the merits of mass spectrometry-based proteomics for SARS-CoV-2 research and testing.