Large-scale protein identifications from highly complex protein mixtures have recently been achieved using multidimensional liquid chromatography coupled with tandem mass spectrometry (LC/LC-MS/MS) and subsequent database searching with algorithms such as SEQUEST. Here, we describe a probability-based evaluation of false positive rates associated with peptide identifications from three different human proteome samples. Peptides from human plasma, human mammary epithelial cell (HMEC) lysate, and human hepatocyte (Huh)-7.5 cell lysate were separated by strong cation exchange (SCX) chromatography coupled offline with reversed-phase capillary LC-MS/MS analyses. The MS/MS spectra were first analyzed by SEQUEST, searching independently against both normal and sequence-reversed human protein databases, and the false positive rates of peptide identifications for the three proteome samples were then analyzed and compared. The observed false positive rates of peptide identifications for human plasma were significantly higher than those for the human cell lines when identical filtering criteria were used, suggesting that the false positive rates are significantly dependent on sample characteristics, particularly the number of proteins found within the detectable dynamic range. Two new sets of filtering criteria are proposed for human plasma and human cell lines, respectively, to provide an overall confidence of >95% for peptide identifications. The new criteria were compared, using a normalized elution time (NET) criterion (Petritis et al. Anal. Chem. 2003, 75, 1039-1048), with previously published criteria (Washburn et al. Nat. Biotechnol. 2001, 19, 242-247). The results demonstrate that the present criteria provide significantly higher levels of confidence for peptide identifications from mammalian proteomes without greatly decreasing the number of identifications.
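The reversed-database comparison described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the false positive rate is estimated as the number of reversed-database hits passing a filter divided by the number of normal-database hits passing the same filter. The `PeptideHit` class, the function names, and the threshold values are all hypothetical; the thresholds are illustrative numbers and are not the criteria proposed in the paper.

```python
from dataclasses import dataclass


@dataclass
class PeptideHit:
    xcorr: float     # SEQUEST cross-correlation score
    delta_cn: float  # SEQUEST delta-Cn score
    charge: int      # precursor charge state


def passes(hit, min_xcorr_by_charge, min_delta_cn):
    """True if the hit meets the charge-specific Xcorr and the delta-Cn cutoffs."""
    return (hit.xcorr >= min_xcorr_by_charge[hit.charge]
            and hit.delta_cn >= min_delta_cn)


def false_positive_rate(forward_hits, reversed_hits,
                        min_xcorr_by_charge, min_delta_cn):
    """Assumed estimate: passing reversed hits / passing normal hits."""
    n_fwd = sum(passes(h, min_xcorr_by_charge, min_delta_cn) for h in forward_hits)
    n_rev = sum(passes(h, min_xcorr_by_charge, min_delta_cn) for h in reversed_hits)
    return n_rev / n_fwd if n_fwd else 0.0


# Illustrative thresholds and toy data (hypothetical, not from the paper).
thresholds = {1: 1.9, 2: 2.2, 3: 3.75}
forward = [PeptideHit(2.5, 0.20, 2), PeptideHit(3.0, 0.15, 2),
           PeptideHit(1.0, 0.30, 1), PeptideHit(4.0, 0.12, 3),
           PeptideHit(2.1, 0.25, 1)]
reverse = [PeptideHit(2.3, 0.11, 2), PeptideHit(1.5, 0.05, 1)]

fpr = false_positive_rate(forward, reverse, thresholds, min_delta_cn=0.1)
print(f"estimated false positive rate: {fpr:.2f}")
```

Under this estimate, an overall confidence above 95% would correspond to choosing cutoffs for which the ratio stays below 0.05 for the sample type in question.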
The use of artificial neural networks (ANNs) is described for predicting the reversed-phase liquid chromatography retention times of peptides enzymatically digested from proteome-wide proteins. To enable the accurate comparison of the numerous LC/MS data sets, a genetic algorithm was developed to normalize the peptide retention data into the range 0 to 1, improving the peptide elution time reproducibility to approximately 1%. The network developed in this study was based on amino acid residue composition and consists of 20 input nodes, 2 hidden nodes, and 1 output node. A data set of approximately 7000 confidently identified peptides from the microorganism Deinococcus radiodurans was used for the training of the ANN. The ANN was then used to predict the elution times for another set of 5200 peptides tentatively identified by MS/MS from a different microorganism (Shewanella oneidensis). The model was found to predict the elution times of peptides with up to 54 amino acid residues (the longest peptide identified after tryptic digestion of S. oneidensis) with an average accuracy of approximately 3%. This predictive capability was then used to distinguish with high confidence isobaric peptides otherwise indistinguishable by accurate mass measurements, as well as to uncover peptide misidentifications. Thus, integrating ANN peptide elution time prediction into proteomic research will increase both the number of protein identifications and their confidence.
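The 20-2-1 architecture described above can be illustrated with a forward pass over an amino acid composition vector. This is only a sketch under stated assumptions: the sigmoid activations, the ordering of the 20 residues, and the randomly drawn weights are all hypothetical stand-ins, since the abstract gives neither the trained weights nor the activation functions. The function names are invented for illustration.

```python
import math
import random

# The 20 standard residues; the ordering here is an arbitrary assumption.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(peptide):
    """Count each of the 20 residues, giving the 20 network inputs."""
    return [peptide.count(aa) for aa in AMINO_ACIDS]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_net(peptide, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: 20 inputs -> 2 hidden nodes -> 1 output in (0, 1)."""
    x = composition_vector(peptide)
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


# Hypothetical, untrained weights (seeded for repeatability).
rng = random.Random(0)
w_hidden = [[rng.uniform(-0.1, 0.1) for _ in AMINO_ACIDS] for _ in range(2)]
b_hidden = [0.0, 0.0]
w_out = [rng.uniform(-1.0, 1.0) for _ in range(2)]
b_out = 0.0

net = predict_net("LVNELTEFAK", w_hidden, b_hidden, w_out, b_out)
print(f"predicted normalized elution time (untrained): {net:.3f}")
```

Because the inputs are residue counts only, two peptides with the same composition but different sequences receive the same prediction under this sketch, whatever the weights.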
Molecular activation by blackbody photons, first postulated in 1919 by Perrin, plays a dominant role in the unimolecular dissociation of large ions trapped at low pressure in a Fourier-transform mass spectrometer. Under readily achievable experimental conditions, molecular ions of the protein ubiquitin equilibrate with the blackbody radiation field inside the vacuum chamber. The internal energy of a population of these ions is given by a Boltzmann distribution. From the temperature dependence of unimolecular dissociation rate constants measured in the zero-pressure limit, Arrhenius activation parameters equal to those in the high-pressure limit are obtained.
Ion mobility spectrometry (IMS) has been explored for decades, and its versatility in separation and identification of gas-phase ions is well established. Recently, field asymmetric waveform IMS (FAIMS) has been gaining acceptance in similar applications. Coupled to mass spectrometry (MS), both IMS and FAIMS have shown the potential for broad utility in proteomics and other biological analyses. A major attraction of these separations is extremely high speed, exceeding that of condensed-phase alternatives by orders of magnitude. However, modest separation peak capacities have limited the utility of FAIMS and IMS for analyses of complex mixtures. We report 2-D gas-phase separations that join FAIMS to IMS, in conjunction with high-resolution and accuracy time-of-flight MS. Implementation of FAIMS/IMS and IMS/MS interfaces using electrodynamic ion funnels greatly improves sensitivity. Evaluation of FAIMS/IMS/TOF performance for a protein mixture tryptic digest reveals high orthogonality between FAIMS and IMS dimensions, and hence the benefit of FAIMS filtering prior to IMS/MS. The effective peak capacities in analyses of tryptic peptides are ~500 for FAIMS/IMS separations and ~10^6 for 3-D FAIMS/IMS/MS, providing a potential platform for ultrahigh-throughput analyses of complex mixtures.
Among the greatest challenges of analytical chemistry today is characterizing samples of enormous complexity associated with systems biology research. For example, mammalian proteomes can comprise >20,000 different proteins even before counting post-translational modifications, and sequence and splicing variants [1]. A proteolytic digestion of such a mixture following standard protocols of bottom-up proteomics [1] would yield >10^6 distinct peptides, and more if missed cleavage sites due to the inevitable imperfections in enzyme activity are considered.
Individual peptides are commonly identified and quantified using mass spectrometry (MS) that offers excellent sensitivity, specificity, and dynamic range [1]. However, no technique can presently characterize a significant percentage of the constituents in such a complex sample without extensive prior separations, and direct MS analyses generally identify with confidence only the most abundant proteins. "Top-down" analyses at the intact-protein level have similar limitations. Accordingly, combinations of various separation techniques with MS have become preeminent bioanalytical technologies. The separations have conventionally been performed in condensed phases (e.g., liquid chromatography (LC) [2], capillary electrophoresis (CE) [3], and gel techniques [4]). Single separation stages can provide peak capacities [2] of ~10^2-10^3. This level is insufficient for many challenging applications: in a mixture of ~10^6 components, a separated fraction would still comprise ~10^3-10^4 co-eluting species on average, and substantially more in some cases. Hence large-scale proteomics often involves multidimensional separations using two or more different stages, followed by MS characterization. The be...
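The co-elution estimate in the passage above is a simple ratio, and a short check makes the arithmetic explicit. The sketch assumes components are spread evenly across the separation space, which is the "on average" case; the function name is hypothetical.

```python
def mean_coeluting(n_components, peak_capacity):
    """Average number of species per resolved peak, assuming even spreading."""
    return n_components / peak_capacity


n_peptides = 10**6  # order of magnitude quoted for a mammalian proteome digest

for capacity in (10**2, 10**3):
    species = mean_coeluting(n_peptides, capacity)
    print(f"peak capacity {capacity}: ~{species:.0f} co-eluting species")
```

Real mixtures are not evenly distributed, which is why the text notes that some fractions contain substantially more species than this average.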
The dissociation kinetics of protonated leucine enkephalin and its proton and alkali metal bound dimers were investigated by blackbody infrared radiative dissociation (BIRD) in a Fourier-transform mass spectrometer. From the temperature dependence of the unimolecular dissociation rate constants, Arrhenius activation parameters in the zero-pressure limit are obtained. Protonated leucine enkephalin dissociates to form b4 and (M-H2O)+ ions with an average activation energy (Ea) of 1.1 eV and an A factor of 10^10.5 s^-1. The value of the A factor indicates that these dissociation processes are rearrangements. The b4 ions subsequently dissociate to form a4 ions via a process with a relatively high activation energy (1.3 eV), but one that is entropically favored. For the cationized dimers, the thermal stability decreases with increasing cation size, consistent with a simple electrostatic interaction in these noncovalent ion-molecule complexes. The Ea and A factors are indistinguishable within experimental error with values of approximately 1.5 eV and 10^17 s^-1, respectively. Although not conclusive, results from master equation modeling indicate that all these BIRD processes, except for b4 → a4, are in the rapid energy exchange limit. In this limit, the internal energy of the precursor ion population is given by a Boltzmann distribution and information about the energetics and dynamics of the reaction is obtained directly from the measured Arrhenius parameters.
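Extracting Arrhenius parameters from the temperature dependence of rate constants amounts to a straight-line fit of ln k against 1/T, since k = A exp(-Ea / (kB T)). The sketch below illustrates that fit on synthetic, noise-free rate constants generated from the values quoted above for protonated leucine enkephalin (Ea = 1.1 eV, A = 10^10.5 s^-1). The temperature grid and function names are assumptions made for illustration; no measured data from the study are reproduced here.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_rate(temp_k, ea_ev, log10_a):
    """Rate constant k = A * exp(-Ea / (kB * T)), in s^-1."""
    return 10.0 ** log10_a * math.exp(-ea_ev / (K_B_EV * temp_k))


def fit_arrhenius(temps_k, rates):
    """Least-squares line through (1/T, ln k); returns (Ea in eV, log10 A)."""
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(k) for k in rates]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return -slope * K_B_EV, intercept / math.log(10.0)


# Hypothetical temperature grid; synthetic rates from the quoted parameters.
temps = [420.0, 435.0, 450.0, 465.0, 480.0]
rates = [arrhenius_rate(t, ea_ev=1.1, log10_a=10.5) for t in temps]

ea_fit, log10_a_fit = fit_arrhenius(temps, rates)
print(f"Ea = {ea_fit:.3f} eV, log10(A / s^-1) = {log10_a_fit:.2f}")
```

With real measurements the points scatter about the line, and the uncertainty in the slope and intercept sets the error bars on Ea and A.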
A new quantitative cysteinyl-peptide enrichment technology (QCET) was developed to achieve higher efficiency, greater dynamic range, and higher throughput in quantitative proteomics that use stable-isotope labeling techniques combined with high-resolution liquid chromatography (LC)-mass spectrometry (MS). This approach involves (18)O labeling of tryptic peptides, high-efficiency enrichment of cysteine-containing peptides, and confident protein identification and quantification using the accurate mass and time tag strategy. Proteome profiling of naïve and in vitro-differentiated human mammary epithelial cells using QCET resulted in the identification and quantification of 603 proteins in a single LC-Fourier transform ion cyclotron resonance MS analysis. Advantages of this technology include the following: (1) a simple, highly efficient method for enriching cysteinyl-peptides; (2) a high-throughput strategy suitable for extensive proteome analysis; and (3) improved labeling efficiency for better quantitative measurements. This technology enhances both the functional analysis of biological systems and the detection of potential clinical biomarkers.
We describe the application of a previously reported model for predicting peptide retention times in reversed-phase liquid chromatography (RPLC) (Petritis et al. Anal. Chem. 2003, 75, 1039) to improve peptide identification. The model uses peptide sequence information to generate a theoretical (predicted) elution time that can be compared with the observed elution time. Using data from a set of known proteins, the retention time parameter was incorporated into a discriminant function for use with tandem mass spectrometry (MS/MS) data analyzed with the peptide/protein identification program SEQUEST. For singly charged ions, the number of confident identifications increased by 12% when the elution time metric was included, compared with using mass spectral data as the sole source of information, in the context of a Drosophila melanogaster database. A 3-4% improvement was obtained for doubly and triply charged ions for the same biological system. Application to the larger Rattus norvegicus (rat) and human proteome databases resulted in an 8-9% overall increase in the number of confident identifications when both the discriminant function and elution time were used. The effect of adding "runner-up" hits (peptide matches that are not the highest scoring for a spectrum) from SEQUEST is also explored, and we find that the number of confident identifications is further increased by 1% when these hits are also considered. Finally, application of the discriminant functions derived in this work to approximately 2.2 million spectra from over three hundred LC-MS/MS analyses of peptides from human plasma protein resulted in a 16% increase in confident peptide identifications (9022 vs 7779) using elution time information. Further improvements from the use of elution time information can be expected as both the experimental control of elution time reproducibility and the predictive capability are improved.
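One way to picture how an elution time term enters a discriminant function is as an extra feature in a linear score, alongside the SEQUEST scores. The sketch below is purely illustrative: the linear form, the feature set, the coefficient values, and the function names are all hypothetical and are not the discriminant function derived in the paper. The final lines simply check the 16% figure against the quoted counts (9022 vs 7779).

```python
def net_deviation(predicted_net, observed_net):
    """Absolute difference between predicted and observed normalized elution time."""
    return abs(predicted_net - observed_net)


def discriminant_score(xcorr, delta_cn, net_dev, weights, bias=0.0):
    """Linear discriminant: weighted sum of features plus a bias term."""
    features = (xcorr, delta_cn, net_dev)
    return sum(w * f for w, f in zip(weights, features)) + bias


def relative_increase(with_elution_time, without_elution_time):
    return (with_elution_time - without_elution_time) / without_elution_time


# Hypothetical coefficients: a larger elution time deviation lowers the score.
weights = (1.0, 5.0, -10.0)

close_match = discriminant_score(2.4, 0.15, net_deviation(0.41, 0.40), weights)
poor_match = discriminant_score(2.4, 0.15, net_deviation(0.41, 0.70), weights)
print(f"score with small deviation: {close_match:.2f}")
print(f"score with large deviation: {poor_match:.2f}")

gain = relative_increase(9022, 7779)
print(f"relative increase in identifications: {gain:.1%}")
```

Two candidate peptides with identical mass spectral scores are separated here only by how well their predicted elution time agrees with the observed one, which is the kind of information the elution time metric contributes.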
We describe an improved artificial neural network (ANN)-based method for predicting peptide retention times in reversed-phase liquid chromatography. In addition to the peptide amino acid composition, this study investigated several other peptide descriptors to improve the predictive capability, such as peptide length, sequence, hydrophobicity and hydrophobic moment, and nearest neighbor amino acid, as well as peptide predicted structural configurations (i.e., helix, sheet, coil). An ANN architecture that consisted of 1052 input nodes, 24 hidden nodes, and 1 output node was used to fully consider the amino acid residue sequence in each peptide. The network was trained using ~345,000 non-redundant peptides identified from a total of 12,059 LC-MS/MS analyses of more than 20 different organisms, and the predictive capability of the model was tested using 1303 confidently identified peptides that were not included in the training set. The model demonstrated an average elution time precision of ~1.5% and was able to distinguish among isomeric peptides based upon the inclusion of peptide sequence information. The prediction power represents a significant improvement over our earlier report (Petritis et al., Anal. Chem. 2003, 75, 1039-1048) and other previously reported models.