A proteoform is a defined form of a protein derived from a given gene with a specific amino acid sequence and localized post‐translational modifications. In top‐down proteomic analyses, proteoforms are identified and quantified through mass spectrometric analysis of intact proteins. Recent technological developments have enabled comprehensive proteoform analyses in complex samples, and an increasing number of laboratories are adopting top‐down proteomic workflows. In this review, some recent advances are outlined and current challenges and future directions for the field are discussed.
The recent discovery of significant hydropersulfide (RSSH) levels in mammalian tissues, fluids and cells has led to numerous questions regarding their possible physiological function. Cysteine hydropersulfides have been found in free cysteine, small-molecule peptides, and proteins. Based on their chemical properties and the likely cellular conditions associated with their biosynthesis, it has been proposed that they can serve a protective function. That is, hydropersulfide formation on critical thiols may protect them from irreversible oxidative or electrophilic inactivation. As a prelude to understanding the possible roles and functions of hydropersulfides in biological systems, this study utilizes primarily chemical experiments to delineate the possible mechanistic chemistry associated with cellular protection. Thus, the ability of hydropersulfides to protect against irreversible electrophilic and oxidative modification was examined. The results herein indicate that hydropersulfides are very reactive towards oxidants and electrophiles and are modified readily. However, reduction of these oxidized/modified species is facile, generating the corresponding thiol, consistent with the idea that hydropersulfides can serve a protective function for thiol proteins.
The Consortium for Top-Down Proteomics (www.topdownproteomics.org) launched the present study to assess the current state of top-down mass spectrometry (TD MS) and middle-down mass spectrometry (MD MS) for characterizing monoclonal antibody (mAb) primary structures, including their modifications. To meet the needs of the rapidly growing therapeutic antibody market, it is important to develop analytical strategies to characterize the heterogeneity of a therapeutic product's primary structure accurately and reproducibly. The major objective of the present study is to determine whether current TD/MD MS technologies and protocols can add value to the more commonly employed bottom-up (BU) approaches with regard to confirming protein integrity, sequencing variable domains, avoiding artifacts, and revealing modifications and their locations. We also aim to gather information
The rapid and accurate quantification of peptides is a critical element of modern proteomics that has become increasingly challenging as proteomic data sets grow in size and complexity. We present here FlashLFQ, a computer program for high-speed label-free quantification of peptides following a search of bottom-up mass spectrometry data. FlashLFQ is approximately an order of magnitude faster than established label-free quantification methods. The increased speed makes it practical to base quantification upon all of the charge states for a given peptide rather than solely upon the charge state that was selected for MS2 fragmentation. This increases the number of quantified peptides, improves replicate-to-replicate reproducibility, and increases quantitative accuracy. We integrated FlashLFQ into the graphical user interface of the MetaMorpheus search software, allowing it to work together with the global post-translational modification discovery (G-PTM-D) engine to accurately quantify modified peptides. FlashLFQ is also available as a NuGet package, facilitating its integration into other software, and as a standalone command line software program for the quantification of search results from other programs (e.g., MaxQuant).
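The difference between quantifying a peptide from all of its charge states and quantifying it only from the charge state selected for MS2 can be illustrated with a minimal sketch. This is not FlashLFQ's code or API; the input tuples, peptide sequences, intensities, and function names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical input: (peptide sequence, charge state, integrated peak intensity).
# The values are made up for illustration, not taken from any data set.
PEAKS = [
    ("PEPTIDEK", 2, 1.0e6),
    ("PEPTIDEK", 3, 4.0e5),
    ("LVNELTEFAK", 2, 2.5e5),
]


def quantify_all_charges(peaks):
    """Sum intensities over every observed charge state of each peptide."""
    totals = defaultdict(float)
    for sequence, _charge, intensity in peaks:
        totals[sequence] += intensity
    return dict(totals)


def quantify_single_charge(peaks, ms2_charge):
    """Keep only the charge state that was selected for MS2.

    ms2_charge maps a peptide sequence to the charge state of its
    identifying MS2 scan (an assumed input for this sketch).
    """
    totals = defaultdict(float)
    for sequence, charge, intensity in peaks:
        if ms2_charge.get(sequence) == charge:
            totals[sequence] += intensity
    return dict(totals)


all_charges = quantify_all_charges(PEAKS)
single_charge = quantify_single_charge(PEAKS, {"PEPTIDEK": 3, "LVNELTEFAK": 2})
```

In this toy case, the single-charge approach reports only the weaker 3+ signal for PEPTIDEK, while the all-charge approach also captures the 2+ signal.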
Recent reports indicate the ubiquitous prevalence of hydropersulfides (RSSH) in mammalian systems. The biological utility of these and related species is currently a matter of significant speculation. The function, lifetime and fate of hydropersulfides will assuredly depend on their chemical properties and reactivity. Thus, to serve as the basis for further mechanistic studies regarding hydropersulfide biology, some of the basic chemical properties and reactivity of hydropersulfides were studied. The nucleophilicity, electrophilicity and redox properties of hydropersulfides were examined under biological conditions. These studies indicate that hydropersulfides can be nucleophilic or electrophilic, depending on the pH (i.e., the protonation state), and can act as good one- and two-electron reductants. These diverse chemical properties in a single species make hydropersulfides chemically distinct from other well-known sulfur-containing biological species, giving them unique and potentially important biological function.
Peptides detected by tandem mass spectrometry (MS/MS) in bottom-up proteomics serve as proxies for the proteins expressed in the sample. Protein inference is a process routinely applied to these peptides to generate a plausible list of candidate protein identifications. The use of multiple proteases for parallel protein digestions expands sequence coverage, provides additional peptide identifications, and increases the probability of identifying peptides that are unique to a single protein, which are all valuable for protein inference. We have developed and implemented a multi-protease protein inference algorithm in MetaMorpheus, a bottom-up search software program, which incorporates the calculation of protease-specific q-values and preserves the association of peptide sequences and their protease of origin. This integrated multi-protease protein inference algorithm provides more accurate results than either the aggregation of results from the separate analysis of the peptide identifications produced by each protease (separate approach) in MetaMorpheus, or results that are obtained using Fido, ProteinProphet, or DTASelect2. MetaMorpheus' integrated multi-protease data analysis decreases the ambiguity of the protein group list, reduces the frequency of erroneous identifications, and increases the number of post-translational modifications identified, while combining multi-protease search and protein inference into a single software program.
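The idea of protease-specific q-values can be sketched with a standard target-decoy calculation applied separately to the identifications from each protease. This is a generic illustration, not the MetaMorpheus implementation; the tuple layout, scores, and function names are assumptions made for the example.

```python
from collections import defaultdict


def q_values(psms):
    """Target-decoy q-values for a list of (score, is_decoy) tuples.

    Returns q-values in the same order as the input.
    """
    order = sorted(range(len(psms)), key=lambda i: psms[i][0], reverse=True)
    fdrs = []
    targets = decoys = 0
    for i in order:
        if psms[i][1]:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR at this score threshold: decoys / targets.
        fdrs.append(decoys / max(targets, 1))
    # A q-value is the minimum FDR at this threshold or any more lenient one.
    result = [0.0] * len(psms)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        result[order[rank]] = running_min
    return result


def q_values_by_protease(psms):
    """q-values computed within each protease group.

    psms: list of (protease, score, is_decoy); the protease label stays
    attached to each identification throughout.
    """
    groups = defaultdict(list)
    for idx, (protease, score, is_decoy) in enumerate(psms):
        groups[protease].append((idx, score, is_decoy))
    result = [None] * len(psms)
    for members in groups.values():
        qs = q_values([(score, is_decoy) for _, score, is_decoy in members])
        for (idx, _, _), q in zip(members, qs):
            result[idx] = q
    return result


# Hypothetical identifications: (protease, score, is_decoy).
example = [
    ("trypsin", 10.0, False),
    ("trypsin", 8.0, True),
    ("trypsin", 6.0, False),
    ("GluC", 9.0, False),
]
example_q = q_values_by_protease(example)
```

Grouping before the calculation means that score distributions from one protease do not influence the error estimate for another.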
Protein chemical cross-linking combined with mass spectrometry has become an important technique for the analysis of protein structure and protein-protein interactions. A variety of cross-linkers are well developed, but reliable, rapid, and user-friendly tools for large-scale analysis of cross-linked proteins are still needed. Here we report MetaMorpheusXL, a new search module within the MetaMorpheus software suite that identifies both MS-cleavable and noncleavable cross-linked peptides in MS data. MetaMorpheusXL identifies MS-cleavable cross-linked peptides with an ion-indexing algorithm, which enables an efficient large database search. The identification does not require the presence of signature fragment ions, an advantage compared with similar programs such as XlinkX. One complication associated with the need for signature ions from cleavable cross-linkers such as DSSO (disuccinimidyl sulfoxide) is the requirement for multiple fragmentation types and energy combinations, which is not necessary for MetaMorpheusXL. The ability to perform proteome-wide analysis is another advantage of MetaMorpheusXL compared with programs such as MeroX and DXMSMS. MetaMorpheusXL is also faster than other currently available MS-cleavable cross-link search software programs. It is embedded in MetaMorpheus, an open-source and freely available software suite that provides a reliable, fast, user-friendly graphical user interface that is readily accessible to researchers.
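The general principle behind a fragment-ion index can be shown with a small sketch: theoretical fragment masses are binned once, and each observed peak then looks up candidate peptides directly instead of being compared against every peptide in the database. This is a generic illustration and not the MetaMorpheusXL algorithm; the bin width, tolerance, peptide names, and masses are hypothetical.

```python
from collections import Counter, defaultdict

BIN_WIDTH = 0.01  # Da; an assumed bin size for this sketch


def build_index(peptide_fragments):
    """Map each mass bin to the set of peptides with a fragment in that bin.

    peptide_fragments: dict of peptide id -> list of theoretical fragment masses.
    """
    index = defaultdict(set)
    for peptide, masses in peptide_fragments.items():
        for mass in masses:
            index[round(mass / BIN_WIDTH)].add(peptide)
    return index


def count_matches(index, observed_masses, tol_bins=1):
    """Count, per peptide, how many observed peaks fall near an indexed fragment."""
    counts = Counter()
    for mass in observed_masses:
        center = round(mass / BIN_WIDTH)
        hits = set()
        # Check neighboring bins so peaks near a bin edge are not missed.
        for b in range(center - tol_bins, center + tol_bins + 1):
            hits |= index.get(b, set())
        counts.update(hits)  # each peptide counted at most once per peak
    return counts


# Hypothetical theoretical fragments and observed peaks.
fragments = {
    "pepA": [100.00, 200.00, 300.00],
    "pepB": [100.00, 250.00],
}
index = build_index(fragments)
matches = count_matches(index, [100.004, 200.0, 300.0])
```

Because the index is built once per database, the cost of scoring a spectrum scales with its number of peaks, not with the number of peptides.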
Background: The detection of physiologically relevant protein isoforms encoded by the human genome is critical to biomedicine. Mass spectrometry (MS)-based proteomics is the preeminent method for protein detection, but isoform-resolved proteomic analysis relies on accurate reference databases that match the sample; neither a subset nor a superset database is ideal. Long-read RNA sequencing (e.g., PacBio or Oxford Nanopore) provides full-length transcripts which can be used to predict full-length protein isoforms. Results: We describe here a long-read proteogenomics approach for integrating sample-matched long-read RNA-seq and MS-based proteomics data to enhance isoform characterization. We introduce a classification scheme for protein isoforms, discover novel protein isoforms, and present the first protein inference algorithm for the direct incorporation of long-read transcriptome data to enable detection of protein isoforms previously intractable to MS-based detection. We have released an open-source Nextflow pipeline that integrates long-read sequencing in a proteomic workflow for isoform-resolved analysis. Conclusions: Our work suggests that the incorporation of long-read sequencing and proteomic data can facilitate improved characterization of human protein isoform diversity. Our first-generation pipeline provides a strong foundation for future development of long-read proteogenomics and its adoption for both basic and translational research.