Building and Searching Tandem Mass Spectral Libraries for Peptide Identification

Lam, Henry

doi:10.1074/mcp.r111.008565

Cited by 71 publications

(51 citation statements)

References 54 publications

“…Some tools take advantages of known spectra and use spectral library databases. 50 Many software packages are well-established for these purposes: examples include Sequest, 51 Mascot, 52 X!Tandem, 53 pfind, 54 Skyline, 55 Sonar, 56 ProbID, 57 Popitam, 58 and Andromeda. 59 For complete lists of software tools, please refer to the following recent reviews.…”

Section: Mass Spectrometry and Its Applications To Identify Histon (mentioning)

confidence: 99%

“…59 For complete lists of software tools, please refer to the following recent reviews. 48–50,60 Mascot and Sequest are the most widely used commercial search engines. Popular protein sequences database, for example, Uniprot, NCBInr, and International Protein Index (IPI), can be used to construct mass fingerprints database.…”

Section: Mass Spectrometry and Its Applications To Identify Histon (mentioning)

confidence: 99%

Quantitative Proteomic Analysis of Histone Modifications

et al. 2015

“…Spectral library (25) search is an emerging approach to peptide identification that relies on an archive of identified tandem mass spectra. Comparing spectra to previously identified spectra is faster, more sensitive, and more accurate than comparing spectra to a sequence database.…”

Section: Results (mentioning)

confidence: 99%
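The comparison of a query spectrum against previously identified spectra, as described in the statement above, can be illustrated with a minimal sketch. It assumes a normalized dot product (cosine similarity) over fixed-width m/z bins, which is one common choice and not necessarily the scoring used by any tool named on this page. The function names, the 1.0 m/z bin width, and the toy peak lists are all hypothetical.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum the intensities of (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz / bin_width)] += intensity
    return binned


def cosine_similarity(query, reference, bin_width=1.0):
    """Normalized dot product of two binned spectra; 0.0 if either is empty."""
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    dot = sum(value * r.get(key, 0.0) for key, value in q.items())
    norm_q = math.sqrt(sum(value * value for value in q.values()))
    norm_r = math.sqrt(sum(value * value for value in r.values()))
    if norm_q == 0.0 or norm_r == 0.0:
        return 0.0
    return dot / (norm_q * norm_r)


def best_library_match(query, library, bin_width=1.0):
    """Return (entry name, score) for the highest-scoring library spectrum."""
    scored = (
        (name, cosine_similarity(query, peaks, bin_width))
        for name, peaks in library.items()
    )
    return max(scored, key=lambda pair: pair[1])


# Toy library with made-up peaks, for illustration only.
library = {
    "ENTRY_A": [(100.1, 10.0), (200.2, 20.0)],
    "ENTRY_B": [(150.5, 5.0), (300.3, 30.0)],
}
query = [(100.2, 9.0), (200.4, 21.0)]
match_name, match_score = best_library_match(query, library)
```

Because library entries carry measured peak intensities, the score can reward agreement in intensity as well as in peak position, which a bare list of predicted fragment masses cannot do.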

An Automated Proteogenomic Method Uses Mass Spectrometry to Reveal Novel Genes in Zea mays

Castellana

Shen

et al. 2014

Molecular & Cellular Proteomics

New technologies in genomics and proteomics have influenced the emergence of proteogenomics, a field at the confluence of genomics, transcriptomics, and proteomics. First generation proteogenomic toolkits employ peptide mass spectrometry to identify novel protein coding regions. We extend first generation proteogenomic tools to achieve greater accuracy and enable the analysis of large, complex genomes. We apply our pipeline to Zea mays, which has a genome comparable in size to human. Our pipeline begins with the comparison of mass spectra to a putative translation of the genome. We select novel peptides, those that match a region of the genome that was not previously known to be protein coding, for grouping into refinement events. We present a novel, probabilistic framework for evaluating the accuracy of each event. Our calculated event probability, or eventProb, considers the number of supporting peptides and spectra, and the quality of each supporting peptide-spectrum match. Our pipeline predicts 165 novel protein-coding genes and proposes updated models for 741 additional genes. Molecular & Cellular Proteomics 13: 10.1074/mcp.M113.031260, 157-167, 2014.

Accurate genome annotation, wherein the location and structure of all protein coding genes are identified, is critically important and yet it remains elusive for even the most extensively studied organisms. The wide availability of inexpensive next-generation sequencing technologies ensures that model organisms from all branches of the tree of life will continue to be sequenced at an ever increasing pace. However, the annotation pipelines are not able to keep up.

Much recent focus on computational gene finding is on incorporating transcript evidence. As with genomic sequencing, availability of high-throughput technologies for transcript sequencing such as RNA-Seq (1) has dramatically changed the genome annotation landscape. Although RNA-Seq provides valuable evidence for genome annotation (2-5) it does not provide a comprehensive solution either. Increasing evidence suggests that a discrepancy exists between protein isoforms that are transcribed versus translated (6). Indeed in our own observation, we find evidence for genes in sampling proteins that are not visible at the transcript level. Moreover, the transcript evidence is confounded by prespliced messages, nontargeted expression noise, ncRNA, and lack of strand and frame information. All of these pose challenges for gene finding.

Tandem mass spectrometry is a key technology for assaying the expressed proteome. In typical bottom-up workflows, enzymatically digested peptides are isolated via chromatography and then fragmented in the mass spectrometer. The collection of masses of peptide fragments (tandem mass spectrum) is used as a fingerprint for identification of expressed peptides.

Historically, the genomics community has provided the annotations (aa sequences) and the proteomics community has focused on identifying peptides and proteins from this annotated list to assay for expression of proteins in specif...

“…The former are used to derive relatively unsophisticated in silico theoretical fragmentation spectra (Cottrell 1994; Eng et al 1994; Yates et al 1995; Geer et al 2004; Fenyo and Beavis 2003), while the latter contain previously identified experimental MS/MS spectra (Lam 2011) (NIST: http://peptide.nist.gov) and have been shown to outperform sequence database searching under optimal conditions. One can also generate a realistic spectral library from a sequence database using detailed MS/MS spectrum prediction (Zhang 2004, 2005; Yen et al 2011; Yen et al 2009), but this is a more complicated process.…”

Section: From Acquired Data To Processed Results (mentioning)

confidence: 99%
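To show what a "relatively unsophisticated" in silico theoretical fragmentation spectrum might look like, the sketch below computes singly charged b- and y-ion m/z values for an unmodified peptide from standard monoisotopic residue masses. It is an illustrative assumption and not the model of any search engine cited above: it predicts peak positions only, with no intensities, modifications, neutral losses, or higher charge states. The function name `theoretical_by_ions` is hypothetical.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # mass of H2O (Da)
PROTON = 1.007276   # mass of a proton (Da)


def theoretical_by_ions(peptide):
    """Return (b_ions, y_ions): singly charged m/z lists for an unmodified peptide.

    b_i covers the first i residues; y_i covers the last i residues.
    """
    masses = [RESIDUE_MASS[residue] for residue in peptide]
    b_ions, y_ions = [], []
    prefix = 0.0
    for mass in masses[:-1]:            # b1 .. b(n-1)
        prefix += mass
        b_ions.append(prefix + PROTON)
    suffix = 0.0
    for mass in reversed(masses[1:]):   # y1 .. y(n-1)
        suffix += mass
        y_ions.append(suffix + WATER + PROTON)
    return b_ions, y_ions


b_ions, y_ions = theoretical_by_ions("GASP")
```

A spectral library replaces this position-only ladder with measured spectra, which is the contrast the statement above draws between the two kinds of resource.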

Crowdsourcing in proteomics: public resources lead to better experiments

Barsnes

Martens

2013

Amino Acids

With the growing interest in the field of proteomics, the amount of publicly available proteome resources has also increased dramatically. This means that there are many useful resources available for almost all aspects of a proteomics experiment. However, it remains vital to use the right resource, for the right purpose, at the right time. This review is therefore meant to aid the reader in obtaining an overview of the available resources and their application, thus providing the necessary background to choose the appropriate resources for the experiment at hand. Many of the resources are also taking advantage of so-called crowdsourcing to maximize the potential of the resource. What this means and how this can improve future experiments will also be discussed. The text roughly follows the steps involved in a proteomics experiment, starting with the planning of the experiment, via the processing of the data and the analysis of the results, to the community-wide sharing of the produced data.
