We present a mass spectral library-based method to identify tandem mass spectra of peptides that contain unanticipated modifications and amino acid variants. We describe this as a "hybrid" method because it combines matching of both ion m/z values and mass losses. The mass loss is the difference between the mass of an ion peak and the mass of its precursor. This difference, termed DeltaMass, is used to shift the product ions in the library spectrum that contain the modification, thereby allowing library product ions that contain the unexpected modification to match the query spectrum. Clustered unidentified spectra from the Clinical Proteomic Tumor Analysis Consortium (CPTAC) and Chinese hamster ovary cells were used to evaluate this method. The results demonstrate the ability of the hybrid method to identify unanticipated modifications, insertions, and deletions, which may include those due to an incomplete protein sequence database or to search settings that exclude the correct identification, in high-resolution tandem mass spectra without regard to their precursor mass. This has been made possible by indexing the m/z value of each fragment ion and its difference in mass from its precursor ion.
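The matching idea described above can be illustrated with a minimal Python sketch. It is not the NIST implementation: the function names, the fixed Dalton tolerance, the assumption of singly charged product ions, and the simple labelling of matches (in place of a real similarity score) are all assumptions made for illustration. Each library peak is first compared to the query directly by m/z, and if that fails it is compared again after being shifted by the precursor mass difference, which is equivalent to matching the two peaks by their mass loss from the precursor.

```python
import bisect
from typing import List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity); intensity is unused in this sketch


def _has_peak(sorted_mzs: List[float], target: float, tol: float) -> bool:
    """Return True if any value in sorted_mzs lies within +/- tol of target."""
    i = bisect.bisect_left(sorted_mzs, target - tol)
    return i < len(sorted_mzs) and sorted_mzs[i] <= target + tol


def hybrid_match(
    library_peaks: List[Peak],
    library_precursor_mass: float,
    query_peaks: List[Peak],
    query_precursor_mass: float,
    tol: float = 0.01,  # hypothetical tolerance in Da
) -> Tuple[float, List[Tuple[float, str]]]:
    """Label each library peak that matches the query as 'direct' or 'shifted'.

    Assumes singly charged product ions, so a mass shift equals an m/z shift.
    """
    delta_mass = query_precursor_mass - library_precursor_mass
    query_mzs = sorted(mz for mz, _ in query_peaks)
    matches = []
    for lib_mz, _ in library_peaks:
        if _has_peak(query_mzs, lib_mz, tol):
            # Product ion without the modification: same m/z in both spectra.
            matches.append((lib_mz, "direct"))
        elif _has_peak(query_mzs, lib_mz + delta_mass, tol):
            # Product ion carrying the modification: same mass loss from the precursor.
            matches.append((lib_mz, "shifted"))
    return delta_mass, matches


# Toy example with made-up masses: the query differs from the library entry by +15.995 Da.
library = [(200.0, 1.0), (300.0, 1.0), (400.0, 1.0)]
query = [(200.0, 1.0), (315.995, 1.0), (415.995, 1.0)]
delta, matched = hybrid_match(library, 500.0, query, 515.995)
```

In this toy case the first library peak matches directly, while the other two match only after the shift, which suggests that the mass difference sits on the fragments that produce those two ions.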
We report the development and availability of a mass spectral reference library for oligosaccharides in human milk. This represents a new variety of spectral library that includes consensus spectra of compounds annotated through various data analysis methods, a concept that can be extended to other varieties of biological fluids. Oligosaccharides from the NIST Standard Reference Material (SRM) 1953, composed of human milk pooled from 100 breastfeeding mothers, were identified and characterized using hydrophilic interaction liquid chromatography electrospray ionization tandem mass spectrometry (HILIC-ESI-MS/MS) and the NIST 17 Tandem MS Library. Consensus reference spectra were generated, incorporated into a searchable library, and matched using the newly developed hybrid search algorithm to elucidate unknown oligosaccharides. The NIST hybrid search program facilitates the structural assignment of complex oligosaccharides, especially when reference standards are not commercially available. High accuracy mass measurement for precursor and product ions, as well as the relatively high MS/MS signal intensities of various oligosaccharide precursors with Fourier transform ion trap (FT-IT) and higher-energy collisional dissociation (HCD) fragmentation techniques, enabled the assignment of multiple free and underivatized fucosyllacto- and sialyllacto-oligosaccharide spectra. Neutral and sialylated isomeric oligosaccharides have distinct retention times, allowing the identification of 74 oligosaccharides in the reference material. This collection of newly characterized spectra based on a searchable, reference MS library of annotated oligosaccharides can be applied to analyze similar compounds in other types of milk or any biological fluid containing milk oligosaccharides.
The Clinical Proteomic Tumor Analysis Consortium (CPTAC) has produced large proteomics datasets from the mass spectrometric interrogation of tumor samples previously analyzed by The Cancer Genome Atlas (TCGA) program. The availability of the genomic and proteomic data is enabling proteogenomic study for both reference (i.e., contained in major sequence databases) and non-reference markers of cancer. The CPTAC labs have focused on colon, breast, and ovarian tissues in the first round of analyses; spectra from these datasets were produced from 2D LC-MS/MS analyses and represent deep coverage. To reduce the variability introduced by disparate data analysis platforms (e.g., software packages, versions, parameters, sequence databases, etc.), the CPTAC Common Data Analysis Platform (CDAP) was created. The CDAP produces both peptide-spectrum-match (PSM) reports and gene-level reports. The pipeline processes raw mass spectrometry data in four steps: (1) peak-picking and quantitative data extraction, (2) database searching, (3) gene-based protein parsimony, and (4) false discovery rate (FDR)-based filtering. The pipeline also produces localization scores for the phosphopeptide enrichment studies using the PhosphoRS program. Quantitative information for each of the datasets is specific to the sample processing, with PSM and protein reports containing the spectrum-level or gene-level (“rolled-up”) precursor peak areas and spectral counts for label-free experiments, or reporter ion log-ratios for 4plex iTRAQ™ experiments. The reports are available in simple tab-delimited formats and, for the PSM-reports, in mzIdentML. The goal of the CDAP is to provide standard, uniform reports for all of the CPTAC data, enabling comparisons between different samples and cancer types as well as across the major ‘omics fields.
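As an illustration of the FDR-based filtering step, here is a minimal Python sketch. It assumes a target-decoy strategy, which the text above does not specify, and it is not the CDAP code: the function name, the tuple layout, and the simple decoys/targets estimate are hypothetical choices made for this example. PSMs are ranked by score, the estimated FDR is tracked down the ranked list, and target PSMs are kept down to the deepest rank at which the estimate is still within the threshold.

```python
from typing import List, Tuple

# Hypothetical record layout: (psm_id, score, is_decoy); a higher score is better.
PSM = Tuple[str, float, bool]


def filter_by_fdr(psms: List[PSM], max_fdr: float = 0.01) -> List[str]:
    """Return IDs of target PSMs that pass a target-decoy FDR threshold.

    The FDR at each rank is estimated as (#decoys) / (#targets) among all
    PSMs scoring at or above that rank. This is an assumed estimator.
    """
    ranked = sorted(psms, key=lambda p: p[1], reverse=True)
    targets = 0
    decoys = 0
    cutoff = 0  # number of top-ranked PSMs to keep
    for rank, (_, _, is_decoy) in enumerate(ranked, start=1):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Remember the deepest rank at which the estimated FDR is acceptable.
        if targets > 0 and decoys / targets <= max_fdr:
            cutoff = rank
    return [psm_id for psm_id, _, is_decoy in ranked[:cutoff] if not is_decoy]


# Toy example with made-up scores.
example = [("a", 9.0, False), ("b", 8.0, False), ("c", 7.0, True), ("d", 6.0, False)]
strict = filter_by_fdr(example, max_fdr=0.0)
loose = filter_by_fdr(example, max_fdr=0.5)
```

With the strict threshold only the PSMs ranked above the first decoy survive; with the looser threshold the target ranked below the decoy is also retained.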