2003
DOI: 10.1002/rcm.1198

A method for reducing the time required to match protein sequences with tandem mass spectra

Abstract: An algorithm for reducing the time necessary to match a large set of peptide tandem mass spectra with a list of protein sequences is described. This algorithm breaks the process into multiple steps. A rapid survey step identifies all protein sequences that are reasonable candidates for a match with a set of tandem mass spectra. These candidates are then used as models, which are refined by detailed analysis of the set of tandem mass spectra for evidence of incomplete enzymatic hydrolysis, non-specific hydrolys…
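
The survey-then-refine idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: matching here uses only precursor mass within a tolerance as a stand-in for real spectrum scoring, "refinement" is limited to allowing missed tryptic cleavages (incomplete enzymatic hydrolysis), and all names (`tryptic_peptides`, `match`, `two_pass_search`, the example proteins and spectra) are hypothetical.

```python
# Toy two-pass search: a fast survey over all proteins, then a more
# permissive refinement restricted to the proteins the survey found.
# Assumption: a "spectrum" is reduced to its neutral precursor mass.

# Monoisotopic residue masses (Da) for the residues used in this example.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "K": 128.09496,
    "E": 129.04259, "R": 156.10111,
}
WATER = 18.01056


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def tryptic_peptides(sequence, max_missed):
    """Peptides from cleaving after K/R (not before P), allowing up to
    max_missed missed cleavage sites per peptide."""
    pieces, start = [], 0
    for i, aa in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if is_last or (aa in "KR" and sequence[i + 1] != "P"):
            pieces.append(sequence[start:i + 1])
            start = i + 1
    peptides = set()
    for i in range(len(pieces)):
        last = min(i + max_missed, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.add("".join(pieces[i:j + 1]))
    return peptides


def match(spectra, proteins, max_missed, tol=0.01):
    """Map spectrum id -> (protein name, peptide) for mass matches."""
    hits = {}
    for spec_id, mass in spectra.items():
        for name, seq in proteins.items():
            for pep in tryptic_peptides(seq, max_missed):
                if abs(peptide_mass(pep) - mass) <= tol:
                    hits[spec_id] = (name, pep)
    return hits


def two_pass_search(spectra, proteins):
    # Survey: strict settings (no missed cleavages) over every protein.
    survey = match(spectra, proteins, max_missed=0)
    # Only proteins with at least one survey hit become candidates.
    candidates = {name: proteins[name] for name, _ in survey.values()}
    # Refinement: relaxed settings, but only candidates and only the
    # spectra that the survey left unassigned.
    leftover = {k: m for k, m in spectra.items() if k not in survey}
    refined = match(leftover, candidates, max_missed=2)
    return survey, refined


proteins = {"P1": "GASKLEVRTPK", "P2": "AAVKGGLR", "P3": "SSTKEEPR"}
spectra = {
    "s1": peptide_mass("GASK"),      # fully cleaved peptide of P1
    "s2": peptide_mass("LEVRTPK"),   # P1 peptide with one missed cleavage
    "s3": peptide_mass("SSTKEEPR"),  # P3 peptide with one missed cleavage
}
survey, refined = two_pass_search(spectra, proteins)
```

In this example `s1` is assigned during the survey, which makes P1 a candidate, and `s2` is then recovered during refinement. `s3` stays unassigned because P3 never passed the survey, which shows the trade-off of restricting the expensive search to a reduced set of sequences.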

Cited by 465 publications (401 citation statements)
References 29 publications

“…This is attractive as it allows typically expensive calculations to be computed quickly because of the greatly reduced search space. This mode of searching is implemented as "error tolerant" mode in Mascot (44), "refinement" mode in Tandem (45), "unassigned single mass gap" mode coupled with searching previous hits in Spectrum Mill, and OMSSA's interactive search. In a related strategy, the Paragon algorithm in ProteinPilot (46) does not focus on subset proteins but rather determines which sequence regions to evaluate more thoroughly during a search using a combination of statistics (using sequence tags to compute Sequence Temperature Values) and search feature probabilities.…”
Section: Fig (mentioning)
confidence: 99%
“…Specifically, if one of the peaks in this series has a fragmentation spectrum, the spectrum can be used to search a database of predicted spectra based on the genome sequence of the species under study (5)(6)(7). From a high-scoring match to the database, one can determine the sequence of the tryptic fragment and the protein from which the fragment originated.…”
Section: Background LC-MS/MS Data (mentioning)
confidence: 99%
“…We note that such database search is in itself a difficult computational challenge, one that is often addressed independently of quantification. We rely on existing algorithms for database search (5,6).…”
Section: Background LC-MS/MS Data (mentioning)
confidence: 99%
“…Database search approaches have been described that iterate the search multiple times to significantly reduce the number of possible sequences considered. The result is that searches are accomplished more quickly, with fewer computational resources and with more stringent parameters [51,52]. The large amount of research in this field outlines the importance of the problem and although significant advances have been made, there is still much to be accomplished to address the challenges in this domain.…”
Section: Introduction (mentioning)
confidence: 99%
“…A goal of this strategy is to initially reduce the number of peptide sequences taken into account, in order to expedite subsequent MS2 analysis. In this manner, it contains some similarity to previously described iteratively refined MS2-based search methods [51,52], but instead uses MS1 interrogation as the first step. This approach is amenable for an automated computation and has been implemented in a software framework to resolve some of the key limitations of the MS2 based methods: examination of nonfragmented peaks, identification of protein modifications that are associated with poor fragmentation patterns and computational algorithmic enhancements.…”
Section: Introduction (mentioning)
confidence: 99%