2004
DOI: 10.1021/pr049882h

Open Source System for Analyzing, Validating, and Storing Protein Identification Data

Abstract: This paper describes an open-source system for analyzing, storing, and validating proteomics information derived from tandem mass spectrometry. It is based on a combination of data analysis servers, a user interface, and a relational database. The database was designed to store the minimum amount of information necessary to search and retrieve data obtained from the publicly available data analysis servers. Collectively, this system was referred to as the Global Proteome Machine (GPM). The components of the sy…

Citations: cited by 644 publications (510 citation statements)
References: 23 publications
“…In 2004, Hillman et al [52] analyzed 1,363 human protein sequences deposited into the SwissProt database and found 107 entries (7.9% of those analyzed) that were derived from transcripts that are apparently subject to NMD. More recently an analysis of mass spectrometry data from the Global Proteome Machine [53] and PeptideAtlas [54] repositories reached the same conclusion: 3UI-containing transcripts can indeed express protein [55]. These results suggest that either the peptides detected in the above proteomics studies are the products of a single round of translation [37], or some endogenous human NMD substrates can undergo multiple rounds of translation prior to being decayed.…”
Section: Splicing Directs mRNP Formation (mentioning)
confidence: 85%
“…With the explosion of proteomics data in recent years, the time is ripe to revisit the idea, with some preliminary demonstration of success being reported in two recent publications [21,22]. The availability of online data repositories developed in recent years [23][24][25][26][27], as well as emerging unified standards for representing shotgun proteomics data, such as mzXML [28] and mzData (http://psidev.sourceforge.net/ms/#mzdata), have made it possible to collect and catalog an adequate set of peptide CID spectra to create a high-quality spectral library. For the spectral library to be comprehensive (containing sufficient entries to cover a high proportion of the observed proteome) and accurate (containing high-quality and truly characteristic MS/MS spectra that can be confidently mapped to peptides), it is important to gather raw MS/MS spectra from a wide variety of sources, to identify them as accurately as possible, to filter out the inevitable false positives, and to process the spectra to reduce noise and other experimental artifacts.…”
Section: Introduction (mentioning)
confidence: 99%
“…The first of these challenges was the need for central and long‐term public repositories to store the generated data. Several such generic repositories are now available, for example PRIDE 4, GPMDB 5, PeptideAtlas 6, and MassIVE (http://massive.ucsd.edu/ProteoSAFe) for shotgun results; and PASSEL 7, SRMAtlas (http://www.srmatlas.org), and Panorama 8 for targeted proteomics quantification data. More specific databases have also been established, related to: diseases, for example TBDB for tuberculosis 9; organisms, for example ProteomicsDB 10 and the Human Proteome Map 11 for the human proteome, and pep2pro for Arabidopsis 12; or subproteomes, for example CSF‐PR 13 for cerebrospinal fluid or TOPPR 14 and TopFIND 15 for in vivo cleaved proteins.…”
Section: Introduction (mentioning)
confidence: 99%
“…It should be noted that some of the existing proteomics databases, most notably GPMDB 5 and PeptideAtlas 6, routinely reprocess their data using dedicated bioinformatics tools and pipelines. GPMDB makes use of the X!Tandem search engine 61, whereas PeptideAtlas employs the Trans Proteomic Pipeline 62.…”
Section: Introduction (mentioning)
confidence: 99%