Proteome characterization relies heavily on tandem mass spectrometry (MS/MS) and is thus associated with instrumentation complexity, lengthy analysis time, and limited duty cycle. It has always been tempting to implement approaches that do not require MS/MS, yet they have consistently failed to achieve meaningful depth of quantitative proteome coverage within short experimental times, which is particularly important for clinical or biomarker discovery applications. Here, we report on the first successful attempt to develop a truly MS/MS-free and label-free method for bottom-up proteomics. We demonstrate identification of 1000 protein groups for a standard HeLa cell line digest using 5-minute LC gradients. The amount of loaded sample was varied in a range from 1 ng to 500 ng, and the method demonstrated 10-fold higher sensitivity compared with the standard MS/MS-based approach. Due to significantly higher sequence coverage obtained by the developed method, it outperforms all popular MS/MS-based label-free quantitation approaches. Advances in mass-spectrometry-based proteomic technologies have resulted in dramatically increased depth, throughput, and sensitivity of proteome coverage. Up to 10,000 proteins can be identified in a 100-minute analysis of human cell proteomes using state-of-the-art high-resolution Orbitrap mass spectrometry 1. Recently, the notable trend in LC-MS technology developments has been toward increasing the throughput of proteome-wide analysis while preserving quantitation accuracy 2,3. However, these achievements rely heavily on the use of tandem mass spectrometry (MS/MS), which includes sequential isolation of eluting peptides followed by their fragmentation. While being a crucial and seemingly the only source of sequence-specific information about the peptides, MS/MS brings a number of well-known challenges. Due to the limited both the speed of the mass analyzer (which is
We present an open-source, extensible search engine for shotgun proteomics. Implemented in the Python programming language, IdentiPy shows competitive processing speed and sensitivity compared with state-of-the-art search engines. It is equipped with a user-friendly web interface, IdentiPy Server, which allows a single server installation to be accessed from multiple workstations. Using a simplified version of the X!Tandem scoring algorithm and its novel "autotune" feature, IdentiPy outperforms the popular alternatives on high-resolution data sets. Autotune adjusts the search parameters for the particular data set, resulting in improved search efficiency and simplifying the user experience. IdentiPy with the autotune feature shows higher sensitivity compared with the evaluated search engines. IdentiPy Server has built-in postprocessing and protein inference procedures and provides graphic visualization of the statistical properties of the data set and the search results. It is open-source, can be freely extended to use third-party scoring functions or processing algorithms, and allows customization of the search workflow for specialized applications.
Data-dependent tandem mass spectrometry (MS/MS) is one of the main techniques for protein identification in shotgun proteomics. In a typical LC-MS/MS workflow, peptide product ion mass spectra (MS/MS spectra) are compared with those derived theoretically from a protein sequence database. Scoring of these matches results in peptide identifications. A set of peptide identifications is characterized by its false discovery rate (FDR), which determines the fraction of false identifications in the set. The total number of peptides targeted for fragmentation is in the range of 10,000 to 20,000 for a several-hour LC-MS/MS run. Typically, fewer than 50% of these MS/MS spectra result in peptide-spectrum matches (PSMs). A small fraction of PSMs pass the preset FDR level (commonly 1%), giving a list of identified proteins, yet a large number of correct PSMs corresponding to peptides originally present in the sample are left behind in the "grey area" below the identity threshold. Following the numerous efforts to recover these correct PSMs, here we investigate the utility of a scoring scheme based on the multiple PSM descriptors available from the experimental data. These descriptors include retention time, deviation between experimental and theoretical mass, number of missed cleavages upon in-solution protein digestion, precursor ion fraction (PIF), PSM count per sequence, potential modifications, median fragment mass error, 13C isotope mass difference, charge states, and number of PSMs per protein. The proposed scheme utilizes a set of metrics obtained for the corresponding distributions of each of the descriptors. We found that the proposed PSM scoring algorithm differentiates equally or more efficiently between correct and incorrect identifications compared with existing postsearch validation approaches.
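The FDR threshold described above can be illustrated with a minimal sketch. It assumes a target-decoy search, which the abstract does not specify, and estimates FDR as the ratio of decoy to target PSMs above a score cutoff. The function name `filter_by_fdr` and the `(score, is_decoy)` tuple layout are hypothetical, chosen only for illustration; this is not the scoring scheme proposed in the abstract.

```python
from typing import List, Tuple

PSM = Tuple[float, bool]  # (score, is_decoy); hypothetical layout for this sketch


def filter_by_fdr(psms: List[PSM], fdr: float = 0.01) -> List[PSM]:
    """Return target PSMs from the longest top-scoring prefix whose
    estimated FDR (decoys / targets) does not exceed `fdr`."""
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    cutoff = 0  # number of ranked PSMs accepted so far
    for i, (_, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest rank at which the estimate is still acceptable.
        if targets and decoys / targets <= fdr:
            cutoff = i
    return [p for p in ranked[:cutoff] if not p[1]]
```

PSMs that fall below the cutoff returned here are the "grey area" candidates that additional descriptors such as retention time or mass error might help to recover.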
LC combined with MS/MS analysis of complex mixtures of protein digests is a reliable and sensitive method for characterization of protein phosphorylation. Peptide retention times (RTs) measured during an LC-MS/MS run depend on both the peptide sequence and the location of modified amino acids. These RTs can be predicted using the liquid chromatography of biomacromolecules at critical conditions (BioLCCC) model. Comparing the observed RTs to those obtained from the BioLCCC model can provide additional validation of MS/MS-based peptide identifications to reduce the false discovery rate and to improve the reliability of phosphoproteome profiling. In this study, energies of interaction between phosphorylated residues and the surface of RP separation media for both "classic" alkyl C18 and polar-embedded C18 stationary phases were experimentally determined and included in the BioLCCC model extended for phosphopeptide analysis. The RTs for phosphorylated peptides and their nonphosphorylated analogs were predicted using the extended BioLCCC model and compared with their experimental RTs. The extended model was evaluated using literature data and a complex phosphoproteome data set distributed through the Association of Biomolecular Resource Facilities Proteome Informatics Research Group 2010 study. The reported results demonstrate the capability of the extended BioLCCC model to predict RTs, which may lead to improved sensitivity and reliability of LC-MS/MS-based phosphoproteome profiling.
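The idea of validating identifications by comparing observed and predicted RTs can be sketched as follows. This is an illustrative filter only: the predicted RTs are taken as plain input numbers (no BioLCCC call is made), and the robust tolerance based on the median absolute deviation, the function name `rt_consistent`, and the `n_sigma` parameter are assumptions of this sketch rather than anything described in the abstract.

```python
import statistics
from typing import List, Sequence


def rt_consistent(observed: Sequence[float],
                  predicted: Sequence[float],
                  n_sigma: float = 3.0) -> List[bool]:
    """Flag each PSM as RT-consistent if its RT residual lies within
    n_sigma robust standard deviations of the median residual."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    center = statistics.median(residuals)
    # Median absolute deviation, scaled to approximate a normal sigma.
    mad = statistics.median(abs(r - center) for r in residuals)
    sigma = 1.4826 * mad
    return [abs(r - center) <= n_sigma * sigma for r in residuals]
```

A PSM flagged `False` has an observed RT far from the model prediction and could be treated as a likely false identification, or given a lower weight in postsearch validation.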
In this work, we present the results of evaluating a workflow that employs a multienzyme digestion strategy for MS1-based protein identification in "shotgun" proteomic applications. In the proposed strategy, several cleavage reagents of different specificity were used for parallel digestion of the protein sample, followed by a search based on MS1 and retention time (RT). Proof of principle for the proposed strategy was obtained using experimental data for the annotated 48-protein standard. By using the developed approach, up to 90% of proteins from the standard were unambiguously identified. The approach was further applied to HeLa proteome data. For a sample of this complexity, the proposed MS1-only strategy correctly determined up to 34% of all proteins identified using a standard MS/MS-based database search. It was also found that the results of the MS1-only search were independent of the chromatographic gradient time over a wide range of gradients, from 15 to 120 min. Potentially, rapid MS1-only proteome characterization can be an alternative or a complement to MS/MS-based "shotgun" analyses in studies in which the experimental time is more important than the depth of proteome coverage.
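The parallel digestion step above can be sketched in silico. The abstract does not name the cleavage reagents, so the three rules below (trypsin, Lys-C, Glu-C) and their simplified specificities are assumptions chosen for illustration, as are the function names. The sketch digests one sequence with each rule, without missed cleavages, and computes monoisotopic peptide masses of the kind an MS1-based search would match.

```python
import re
from typing import Dict, List

# Illustrative, simplified cleavage rules (assumptions, not the study's reagents).
RULES: Dict[str, str] = {
    "trypsin": r"(?<=[KR])(?!P)",  # after K or R, but not before P
    "lys-c": r"(?<=K)",            # after K
    "glu-c": r"(?<=E)",            # after E
}

# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def digest(sequence: str, rule: str) -> List[str]:
    """Split a protein sequence at every site matched by the cleavage rule."""
    cuts = [m.start() for m in re.finditer(rule, sequence)]
    bounds = [0] + [c for c in cuts if 0 < c < len(sequence)] + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


def peptide_mass(peptide: str) -> float:
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def multi_digest(sequence: str) -> Dict[str, List[str]]:
    """Digest the same sequence in parallel with every rule in RULES."""
    return {name: digest(sequence, rule) for name, rule in RULES.items()}
```

Because each rule cuts at different sites, the union of peptides across rules covers the sequence with more distinct masses than any single digest, which is the intuition behind using several reagents to make MS1-only identifications less ambiguous.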