A full description of the human proteome relies on the challenging task of detecting mature and changing forms of protein molecules in the body. Large-scale proteome analysis [1] has routinely involved digesting intact proteins followed by inferred protein identification using mass spectrometry (MS) [2]. This “bottom up” process affords a high number of identifications (not always unique to a single gene). However, complications arise from incomplete or ambiguous [2] characterization of alternative splice forms, diverse modifications (e.g., acetylation and methylation), and endogenous protein cleavages, especially when combinations of these create complex patterns of intact protein isoforms and species [3]. “Top down” interrogation of whole proteins can overcome these problems for individual proteins [4,5], but has not been achieved on a proteome scale due to the lack of intact protein fractionation methods that are well integrated with tandem MS. Here we show, using a new four-dimensional (4D) separation system, identification of 1,043 gene products from human cells that are dispersed into >3,000 protein species created by post-translational modification, RNA splicing, and proteolysis. The overall system produced >20-fold increases in both separation power and proteome coverage, enabling the identification of proteins up to 105 kilodaltons and those with up to 11 transmembrane helices. Many previously undetected isoforms of endogenous human proteins were mapped, including changes in multiply-modified species in response to accelerated cellular aging (senescence) induced by DNA damage. Integrated with the latest version of the Swiss-Prot database [6], the data provide precise correlations to individual genes and proof-of-concept for large-scale interrogation of whole protein molecules. The technology promises to improve the link between proteomics data and complex phenotypes in basic biology and disease research [7].
The rise of the “Top Down” method in the field of mass spectrometry-based proteomics has ushered in a new age of promise and challenge for the characterization and identification of proteins. Injecting intact proteins into the mass spectrometer allows for better characterization of post-translational modifications and avoids several of the serious “inference” problems associated with peptide-based proteomics. However, successful implementation of a Top Down approach to endogenous or other biologically relevant samples often requires the use of one or more forms of separation prior to mass spectrometric analysis, and these separations have only begun to mature for whole protein MS. Recent advances in instrumentation have been used in conjunction with new ion fragmentation methods using photons and electrons that allow for better (and often complete) protein characterization in cases simply not tractable even just a few years ago. Finally, the use of native electrospray mass spectrometry has shown great promise for the identification and characterization of whole protein complexes in the 100 kDa to 1 MDa regime, with complete compositional analysis of endogenous protein assemblies a viable goal over the coming few years.
Top-down proteomics is emerging as a viable method for the routine identification of hundreds to thousands of proteins. In this work we report the largest top-down study to date, with the identification of 1,220 proteins from the transformed human cell line H1299 at a false discovery rate of 1%. Multiple separation strategies were utilized, including the focused isolation of mitochondria, resulting in significantly improved proteome coverage relative to previous work. In all, 347 mitochondrial proteins were identified, including ~50% of the mitochondrial proteome below 30 kDa and over 75% of the subunits constituting the large complexes of oxidative phosphorylation. Three hundred of the identified proteins were found to be integral membrane proteins containing between 1 and 12 transmembrane helices, requiring no specific enrichment or modified LC-MS parameters. Over 5,000 proteoforms were observed, many harboring post-translational modifications, including over a dozen proteins containing lipid anchors (some previously unknown) and many others with phosphorylation and methylation modifications. Comparison between untreated and senescent H1299 cells revealed several changes to the proteome, including the hyperphosphorylation of HMGA2. This work illustrates the burgeoning ability of top-down proteomics to characterize large numbers of intact proteoforms in a high-throughput fashion. Molecular & Cellular Proteomics