2017
DOI: 10.1002/mas.21543

Anatomy and evolution of database search engines—a central component of mass spectrometry based proteomic workflows

Abstract: Sequence database search engines are bioinformatics algorithms that identify peptides from tandem mass spectra using a reference protein sequence database. Two decades of development, notably driven by advances in mass spectrometry, have provided scientists with more than 30 published search engines, each with its own properties. In this review, we present the common paradigm behind the different implementations, and its limitations for modern mass spectrometry datasets. We also detail how the search engines a…

citations
Cited by 112 publications
(117 citation statements)
references
References 173 publications
“…Search algorithms frequently employed in HLA ligandomics studies are Mascot, COMET and its predecessor SEQUEST, and Andromeda integrated into the MaxQuant framework. Unbiased de novo spectrum sequencing with PEAKS DB to supplement database search increases the identification rate, without enlarging the search space…”
Section: Bioinformatic Analysis Of MS Data (mentioning)
confidence: 99%
“…4 Unbiased de novo spectrum sequencing with PEAKS DB [74] to supplement database search increases the identification rate, without enlarging the search space [13,71]. All database search strategies function in a similar manner: they take an MS2 spectrum as input and compare it against theoretical fragmentation patterns constructed from the database queried. The search output is a list of peptide sequences ranked according to the scoring scheme implemented in each algorithm.…”
Section: Bioinformatic Analysis Of MS Data (mentioning)
confidence: 99%
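
The workflow described in this statement (take an MS2 spectrum, compare it with theoretical fragmentation patterns, return a ranked list) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the function names, the 0.02 m/z tolerance, the small residue mass table, and above all the scoring scheme, which is a plain shared peak count and not the scoring of Mascot, COMET, SEQUEST, Andromeda or PEAKS DB. Precursor mass filtering, which real engines usually apply first, is left out.

```python
# Monoisotopic residue masses in Da for a handful of amino acids (illustrative subset).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "K": 128.09496, "E": 129.04259,
}
PROTON = 1.00728   # mass of a proton, Da
WATER = 18.01056   # mass of H2O, Da


def fragment_mz(peptide):
    """Return singly charged b- and y-ion m/z values for a peptide string."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    fragments = []
    for i in range(1, len(peptide)):
        fragments.append(sum(masses[:i]) + PROTON)          # b-ion (N-terminal part)
        fragments.append(sum(masses[i:]) + WATER + PROTON)  # y-ion (C-terminal part)
    return fragments


def shared_peak_count(observed_mz, peptide, tol=0.02):
    """Count theoretical fragments that have an observed peak within tol (hypothetical score)."""
    return sum(
        any(abs(obs - theo) <= tol for obs in observed_mz)
        for theo in fragment_mz(peptide)
    )


def search(observed_mz, candidates, tol=0.02):
    """Score every candidate sequence and return (score, peptide) pairs, best first."""
    scored = [(shared_peak_count(observed_mz, p, tol), p) for p in candidates]
    return sorted(scored, key=lambda pair: -pair[0])


# Usage: a noise-free "observed" spectrum generated from PEAK itself.
observed = fragment_mz("PEAK")
ranking = search(observed, ["GASP", "PEAL", "PEAK"])
```

In this toy case the candidate that generated the spectrum matches all of its fragments, while a candidate differing only in its last residue matches only the b-ions. Real scoring schemes are considerably more elaborate than this count.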
“…To compute a BH-FDR, one needs a search engine providing PSM scores which can be related to p-values. Fortunately, numerous state-of-the-art search engines do so [23]: For instance, Mascot and Andromeda provide scores of the form $S = -10 \cdot \log_{10}(p)$ or $p = 10^{-S \dots}$…”
Section: TDC Accuracy Depends On Mass Tolerances Set During Database (mentioning)
confidence: 99%
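
As a hedged illustration of the statement above, the sketch below converts scores of the form S = −10 · log10(p) back to p-values and applies the Benjamini-Hochberg (BH) adjustment. The inverse used here, p = 10^(−S/10), is obtained by rearranging the quoted score definition; the quoted statement itself is truncated at that point. The function names are hypothetical, and treating a given engine's scores as calibrated p-values is an assumption of this example, not something the page establishes.

```python
def score_to_pvalue(score):
    """Invert S = -10 * log10(p), giving p = 10 ** (-S / 10)."""
    return 10 ** (-score / 10)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Usage: four made-up PSM scores, filtered at a 5% BH-FDR threshold.
scores = [20.0, 14.0, 15.2, 23.0]
pvals = [score_to_pvalue(s) for s in scores]
qvals = benjamini_hochberg(pvals)
accepted = [s for s, q in zip(scores, qvals) if q <= 0.05]
```

The adjusted values can be compared directly with the desired FDR level, as in the last line of the usage example.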
“…Proteomics is a field that relies heavily on mass spectrometry for the identification of proteins in a sample 1. The identification process of the acquired fragmentation mass spectra is carried out using bioinformatics tools called search engines that match experimentally obtained mass spectra to peptide sequences 23.…”
Section: Introduction (mentioning)
confidence: 99%
“…Sequence database search engines, who are by far the most popular kind of search engine, assign sequences to spectra by generating theoretical spectra for each potential sequence, matching it to the experimental spectra and attributing a score to each match 23. These theoretical spectra are typically very simple: for example, SEQUEST 10 creates these by assigning an arbitrary magnitude of 50 for b- and y-ion fragment peaks, of 25 for ions with m/z equal to ±1 u from the b- and y-ion fragments, and of 10 for ions associated with neutral losses of water or ammonia.…”
Section: Introduction (mentioning)
confidence: 99%
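
The intensity scheme quoted above (50 for b- and y-ions, 25 for peaks at ±1 u, 10 for neutral losses of water or ammonia) can be sketched in Python as follows. This is an illustration under stated assumptions and not SEQUEST's actual code: the function name, the restriction to singly charged fragments, the small residue mass table, and the unit-width binning by rounding m/z to the nearest integer are all choices made for this example.

```python
# Monoisotopic residue masses in Da for a handful of amino acids (illustrative subset).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "K": 128.09496, "E": 129.04259,
}
PROTON = 1.00728    # Da
WATER = 18.01056    # Da, H2O
AMMONIA = 17.02655  # Da, NH3


def theoretical_spectrum(peptide):
    """Build a {binned m/z: magnitude} map using the 50 / 25 / 10 scheme."""
    peaks = {}

    def add(mz, magnitude):
        # Assumed unit-width binning; keep the largest magnitude per bin.
        key = round(mz)
        peaks[key] = max(peaks.get(key, 0), magnitude)

    masses = [RESIDUE_MASS[aa] for aa in peptide]
    for i in range(1, len(peptide)):
        b_ion = sum(masses[:i]) + PROTON
        y_ion = sum(masses[i:]) + WATER + PROTON
        for ion in (b_ion, y_ion):
            add(ion, 50)            # main b- or y-ion peak
            add(ion - 1.0, 25)      # flanking peak at -1 u
            add(ion + 1.0, 25)      # flanking peak at +1 u
            add(ion - WATER, 10)    # neutral loss of water
            add(ion - AMMONIA, 10)  # neutral loss of ammonia
    return dict(sorted(peaks.items()))


# Usage: the dipeptide GA has one b-ion and one y-ion.
spectrum = theoretical_spectrum("GA")
```

Keeping the largest magnitude per bin is one simple way to handle overlapping peaks; other conventions are possible.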