The Human Proteome Organization (HUPO) launched the Human Proteome Project (HPP) in 2010, creating an international framework for global collaboration, data sharing, quality assurance and enhancing accurate annotation of the genome-encoded proteome. During the subsequent decade, the HPP established collaborations, developed guidelines and metrics, and undertook reanalysis of previously deposited community data, continuously increasing the coverage of the human proteome. On the occasion of the HPP’s tenth anniversary, we here report a 90.4% complete high-stringency human proteome blueprint. This knowledge is essential for discerning molecular processes in health and disease, as we demonstrate by highlighting potential roles the human proteome plays in our understanding, diagnosis and treatment of cancers, cardiovascular and infectious diseases.
The Human Proteome Project (HPP) aims to discover high-stringency data for all proteins encoded by the human genome. Currently, ∼18% of the proteins in the human proteome (the missing proteins) do not have high-stringency evidence (for example, mass spectrometry) confirming their existence, although much additional information is available about many of these missing proteins. Here, we present MissingProteinPedia as a community resource to accelerate the discovery and understanding of these missing proteins.
Background: One of the most significant challenges in colorectal cancer (CRC) management is the use of compliant early stage population-based diagnostic tests as adjuncts to confirmatory colonoscopy. Despite the near curative nature of early clinical stage surgical resection, mortality remains unacceptably high, as the majority of patients identified by faecal haemoglobin testing followed by colonoscopy are diagnosed at later stages. Additionally, current population-based screens reliant on the fecal occult blood test (FOBT) have low compliance (~40%) and suffer from low sensitivity. Therefore, blood-based diagnostic tests offer survival benefits from their higher compliance (≥ 97%), if they can at least match the sensitivity and specificity of FOBTs. However, discovery of low abundance plasma biomarkers is difficult because many high abundance plasma proteins (e.g., human serum albumin) occupy a high percentage of the proteomic discovery space. Methods: A combination of high abundance protein ultradepletion strategies (e.g., MARS-14 and an in-house IgY depletion column), extensive peptide fractionation methods (SCX, SAX, High pH and SEC) and SWATH-MS was utilized to uncover protein biomarkers from a cohort of 100 plasma samples (i.e., pools of 20 healthy and 20 stages I-IV CRC plasmas). The differentially expressed proteins were analyzed using ANOVA and pairwise t-tests (p < 0.05; fold-change > 1.5), and further examined with a neural network classification method using in silico augmented 5000 patient datasets. Results: Ultradepletion combined with peptide fractionation allowed for the identification of a total of 513 plasma proteins, 8 of which had not been previously reported in human plasma (based on the PeptideAtlas database). SWATH-MS analysis revealed 37 protein biomarker candidates that exhibited differential expression across CRC stages compared to healthy controls. Of those, 7 candidates (CST3, GPX3, CFD, MRC1, COMP, PON1 and ADAMDEC1) were validated using Western blotting and/or ELISA. The neural network classification narrowed down the candidate biomarkers to 5 proteins (SAA2, APCS, APOA4, F2 and AMBP) that maintained accuracy in discerning early (I/II) from late (III/IV) stage CRC. Conclusion: MS-based proteomics in combination with ultradepletion strategies has immense potential for identifying diagnostic protein biosignatures.
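The selection rule stated in the Methods above (p < 0.05; fold-change > 1.5) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: to stay dependency-free it swaps the pairwise t-test for an exact permutation test on the difference in means, it assumes the fold-change threshold applies symmetrically to up- and down-regulation, and the function names and intensity values are hypothetical.

```python
from itertools import combinations
from math import log2
from statistics import mean


def fold_change(case, control):
    """Ratio of mean intensities (case / control)."""
    return mean(case) / mean(control)


def permutation_p_value(case, control):
    """Exact two-sided permutation p-value for the difference in means.

    Stand-in for the t-test named in the abstract; enumerates every
    relabelling, so it is only practical for small groups.
    """
    pooled = list(case) + list(control)
    n_case = len(case)
    n_rest = len(pooled) - n_case
    total = sum(pooled)
    observed = abs(mean(case) - mean(control))
    hits = 0
    count = 0
    for idx in combinations(range(len(pooled)), n_case):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_case - (total - group_sum) / n_rest)
        # Small tolerance guards against floating-point ties.
        if diff >= observed - 1e-12:
            hits += 1
        count += 1
    return hits / count


def is_candidate(case, control, alpha=0.05, min_fold=1.5):
    """Apply both thresholds: significance and minimum fold-change.

    Assumption: a protein qualifies if it changes by more than
    `min_fold` in either direction (up or down).
    """
    passes_fold = abs(log2(fold_change(case, control))) > log2(min_fold)
    return passes_fold and permutation_p_value(case, control) < alpha


# Invented intensities for one hypothetical protein.
control = [10, 11, 9, 10, 12]
case = [20, 22, 19, 21, 23]
print(fold_change(case, control), permutation_p_value(case, control))
print(is_candidate(case, control))
```

Requiring both thresholds keeps statistically significant but very small changes out of the candidate list. A real analysis across many proteins would also need a multiple-testing correction, which the abstract does not describe.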
Statistically, accurate protein identification is a fundamental cornerstone of proteomics and underpins the understanding and application of this technology across all elements of medicine and biology. Proteomics, as a branch of biochemistry, has in recent years played a pivotal role in extending and developing the science of accurately identifying the biology and interactions of groups of proteins or proteomes. Proteomics has primarily used mass spectrometry (MS)-based techniques for identifying proteins, although other techniques including affinity-based identifications still play significant roles. Here, we outline the basics of MS to understand how data are generated and parameters used to inform computational tools used in protein identification. We then outline a comprehensive analysis of the bioinformatics and computational methodologies used in protein identification in proteomics including discussing the most current communally acceptable metrics to validate any identification.
Background: Selective kinase and immune checkpoint inhibitors, and their combinations, have significantly improved the survival of patients with advanced metastatic melanoma. Not all patients respond to treatment, however, and some present with significant toxicities. Hence, the identification of biomarkers is critical for the selection and management of patients receiving treatment. Biomarker discovery often involves proteomic techniques that simultaneously profile multiple proteins, but few studies have compared these platforms. Methods: In this study, we used the multiplex bead-based Eve Technologies Discovery assay and the aptamer-based SomaLogic SOMAscan assay to identify circulating proteins predictive of response to immunotherapy in melanoma patients treated with combination immune checkpoint inhibitors. Expression of four plasma proteins was further validated using the bead-based Millipore Milliplex assay. Results: Both the Discovery and the SOMAscan assays detected circulating plasma proteins in immunotherapy-treated melanoma patients. However, these widely used assays showed limited correlation in relative protein quantification, due to differences in specificity and the dynamic range of protein detection. Protein data derived from the Discovery and Milliplex bead-based assays were highly correlated. Conclusions: Our study highlights significant limitations imposed by inconsistent sensitivity and specificity due to differences in the detection antibodies or aptamers of these widespread biomarker discovery approaches. Our findings emphasize the need to improve these technologies for the accurate identification of biomarkers. Electronic supplementary material: The online version of this article (10.1186/s40364-017-0112-9) contains supplementary material, which is available to authorized users.
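As a rough illustration of the cross-platform comparison reported in the Results above, the sketch below computes a Pearson correlation between two platforms' measurements of the same protein across samples. The abstract does not state which correlation measure the authors used, so Pearson is an assumption here, and the function name and input values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Invented readings of one protein in five samples on two platforms.
platform_a = [1.2, 2.4, 3.1, 4.8, 5.0]
platform_b = [0.9, 2.1, 3.5, 4.2, 5.6]
print(pearson(platform_a, platform_b))
```

Because the platforms report in different units and over different dynamic ranges, a rank-based measure such as Spearman's correlation may be a more robust choice for this kind of comparison.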
Background: Current methods widely deployed for colorectal cancer (CRC) screening lack the necessary sensitivity and specificity required for population-based early disease detection. Cancer-specific protein biomarkers are thought to be produced either by the tumor itself or other tissues in response to the presence of cancers or associated conditions. Equally, known examples of cancer protein biomarkers (e.g., PSA, CA125, CA19-9, CEA, AFP) are frequently found in plasma at very low concentration (pg/mL-ng/mL). New sensitive and specific assays are therefore urgently required to detect the disease at an early stage when prognosis is good following surgical resection. This study was designed to meet the longstanding unmet clinical need for earlier CRC detection by measuring plasma candidate biomarkers of cancer onset and progression in a clinical stage-specific manner. EDTA plasma samples (1 μL) obtained from 75 patients with Dukes’ staged CRC or unaffected controls (age and sex matched with stringent inclusion/exclusion criteria) were assayed for expression of 92 human proteins employing the Proseek® Multiplex Oncology I proximity extension assay. An identical set of plasma samples was analyzed utilizing the Bio-Plex Pro™ human cytokine 27-plex immunoassay. Results: Similar quantitative expression patterns for 13 plasma antigens common to both platforms endorsed the potential efficacy of Proseek as an immune-based multiplex assay for proteomic biomarker research. Proseek found that expression of Carcinoembryonic Antigen (CEA), IL-8 and prolactin is significantly correlated with CRC stage. Conclusions: CEA, IL-8 and prolactin expression were found to distinguish between control (unaffected), non-malignant (Dukes’ A + B) and malignant (Dukes’ C + D) stages. Electronic supplementary material: The online version of this article (doi:10.1186/s12014-015-9081-x) contains supplementary material, which is available to authorized users.