Capillary zone electrophoresis-electrospray ionization-tandem mass spectrometry (CZE-ESI-MS/MS) has recently attracted attention for top-down proteomics because it can achieve highly efficient separation and very sensitive detection of proteins. However, the separation window and sample loading volume of CZE need to be increased for better proteome coverage using CZE-MS/MS. Here, we present an improved CZE-MS/MS system that achieved a 180-min separation window and a 2-μL sample loading volume for top-down characterization of protein mixtures. The system obtained highly efficient separation of proteins with nearly one million theoretical plates for myoglobin and enabled baseline separation of three different proteoforms of myoglobin. The CZE-MS/MS system identified 797±21 proteoforms and 258±7 proteins (n=2) from an Escherichia coli (E. coli) proteome sample in a single run with only 250 ng of proteins injected. The system still identified 449±40 proteoforms and 173±6 proteins (n=2) from the E. coli sample when only 25 ng of proteins were injected per run. Single-shot CZE-MS/MS analyses of zebrafish brain cerebellum (Cb) and optic tectum (Teo) regions identified 1,730±196 proteoforms (n=3) and 2,024±255 proteoforms (n=3), respectively, with only 500 ng of proteins loaded per run. Label-free quantitative top-down proteomics of zebrafish brain Cb and Teo regions revealed significant differences in proteoform abundance between the two regions. Over 700 proteoforms from 131 proteins had significantly higher abundance in Cb than in Teo, and these proteins were highly enriched in several biological processes, including muscle contraction, glycolytic process, and mesenchyme migration.
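For readers unfamiliar with the "theoretical plates" figure quoted above, the standard half-height formula for separation efficiency, N = 8 ln 2 (t / w)^2 ≈ 5.54 (t / w)^2, can be sketched as follows. Here t is the migration time of the peak and w is its full width at half maximum, in the same units. The function name and the example numbers are illustrative assumptions, not values taken from the abstract.

```python
import math


def theoretical_plates(migration_time: float, fwhm: float) -> float:
    """Estimate plate count from a peak's migration time and its
    full width at half maximum (both in the same time units)."""
    if migration_time <= 0 or fwhm <= 0:
        raise ValueError("migration_time and fwhm must be positive")
    # 8 * ln(2) is the usual ~5.54 factor for a Gaussian peak
    # measured at half height.
    return 8.0 * math.log(2.0) * (migration_time / fwhm) ** 2


# Hypothetical peak: migrates at 100 min with a 0.235 min half-height width.
# These numbers are made up to show the order of magnitude only.
plates = theoretical_plates(100.0, 0.235)
print(f"N is roughly {plates:.3g} plates")
```

Under these assumed values the result lands near one million plates, which gives a feel for how narrow a peak must be relative to its migration time to reach the efficiency reported.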
Capillary zone electrophoresis (CZE)-tandem mass spectrometry (MS/MS) has recently been recognized as an efficient approach for top-down proteomics because of its high-capacity separation and highly sensitive detection of proteoforms. However, the commonly used collision-based dissociation methods often cannot provide extensive fragmentation of proteoforms for thorough characterization. Activated ion electron transfer dissociation (AI-ETD), which applies infrared photoactivation concurrently with ETD, has shown better performance for proteoform fragmentation than higher-energy collisional dissociation (HCD) and standard ETD. Here, we present the first application of CZE-AI-ETD on an Orbitrap Fusion Lumos mass spectrometer for large-scale top-down proteomics of Escherichia coli (E. coli) cells. CZE-AI-ETD outperformed CZE-ETD regarding proteoform and protein identifications (IDs). CZE-AI-ETD reached comparable proteoform and protein IDs with CZE-HCD. CZE-AI-ETD tended to generate better expectation values (E-values) of proteoforms than CZE-HCD and CZE-ETD, indicating higher quality of
Top-down liquid chromatography-mass spectrometry (LC-MS) analyzes intact proteoforms and generates mass spectra containing peaks of proteoforms with various isotopic compositions, charge states, and retention times. An essential step in top-down MS data analysis is proteoform feature detection, which aims to group these peaks into peak sets (features), each containing all peaks of a proteoform. Accurate protein feature detection enhances the accuracy in MS-based proteoform identification and quantification. Here we present TopFD, a software tool for top-down MS feature detection that integrates algorithms for proteoform feature detection, feature boundary refinement, and machine learning models for proteoform feature evaluation. We performed extensive benchmarking of TopFD, Promex, FlashDeconv, and Xtract using five top-down MS data sets and demonstrated that TopFD outperforms other tools in feature accuracy, reproducibility, and feature abundance reproducibility.
Top-Down Proteomics (TDP) is an emerging proteomics protocol that involves identification, characterization, and quantitation of intact proteins using high-resolution mass spectrometry. TDP has an edge over other proteomics protocols in that it allows for: (i) accurate measurement of intact protein mass, (ii) high sequence coverage, and (iii) enhanced identification of post-translational modifications (PTMs). However, the complexity of TDP spectra poses a significant impediment to protein search and PTM characterization. Furthermore, limited software support is currently available in the form of search algorithms and pipelines. To address this need, we propose ‘SPECTRUM’, an open-architecture and open-source toolbox for TDP data analysis. Its salient features include: (i) MS2-based intact protein mass tuning, (ii) de novo peptide sequence tag analysis, (iii) propensity-driven PTM characterization, (iv) blind PTM search, (v) spectral comparison, (vi) identification of truncated proteins, (vii) multifactorial coefficient-weighted scoring, and (viii) intuitive graphical user interfaces to access the aforementioned functionalities and visualization of results. We have validated SPECTRUM using published datasets and benchmarked it against salient TDP tools. SPECTRUM provides significantly enhanced protein identification rates (91% to 177%) over its contemporaries. SPECTRUM has been implemented in MATLAB, and is freely available along with its source code and documentation at https://github.com/BIRL/SPECTRUM/.
Top-down mass spectrometry has become the main method for intact proteoform identification, characterization, and quantitation. Because of the complexity of top-down mass spectrometry data, spectral deconvolution is an indispensable step in spectral data analysis, which groups spectral peaks into isotopic envelopes and extracts monoisotopic masses of precursor or fragment ions. The performance of spectral deconvolution methods relies heavily on their scoring functions, which distinguish correct envelopes from incorrect ones. A good scoring function increases the accuracy of deconvoluted masses reported from mass spectra. In this paper, we present EnvCNN, a convolutional neural network-based model for evaluating isotopic envelopes. We show that the model outperforms other scoring functions in distinguishing correct envelopes from incorrect ones and that it increases the number of identifications and improves the statistical significance of identifications in top-down spectral interpretation.
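As a rough illustration of what "extracting a monoisotopic mass from an isotopic envelope" involves, the sketch below infers the charge state from the spacing between adjacent isotopic peaks and then converts the lowest m/z peak to a neutral mass. It is a toy under stated assumptions, not EnvCNN or any published deconvolution algorithm: it assumes positive-mode protonated ions, a clean envelope with no missing or interfering peaks, an isotope spacing approximated by the 13C/12C mass difference, and that the lowest observed peak is the monoisotopic one (often untrue for large proteins). All function names are hypothetical.

```python
from statistics import median

PROTON_MASS = 1.007276466   # Da
ISOTOPE_SPACING = 1.0033548  # Da, 13C - 12C difference used as an approximation


def infer_charge(mz_values):
    """Estimate the charge from the median gap between adjacent isotopic peaks."""
    peaks = sorted(mz_values)
    if len(peaks) < 2:
        raise ValueError("need at least two peaks to infer a charge state")
    gaps = [b - a for a, b in zip(peaks, peaks[1:])]
    # Peaks of one envelope are spaced ~ISOTOPE_SPACING / charge apart in m/z.
    return round(ISOTOPE_SPACING / median(gaps))


def neutral_mass(mz, charge):
    """Convert the m/z of a protonated ion [M + zH]^z+ to neutral mass M."""
    return charge * (mz - PROTON_MASS)


def monoisotopic_mass_of_envelope(mz_values):
    """Assume the lowest m/z peak is monoisotopic and report its neutral mass."""
    charge = infer_charge(mz_values)
    return neutral_mass(min(mz_values), charge)


# Synthetic envelope for a made-up 10,000 Da species at charge 10.
envelope = [1001.007276466 + k * ISOTOPE_SPACING / 10 for k in range(5)]
print(infer_charge(envelope), monoisotopic_mass_of_envelope(envelope))
```

Real spectra contain overlapping envelopes and noise, so several candidate envelopes usually compete for the same peaks. That is why the scoring function described above matters for choosing among them.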