2017
DOI: 10.1021/acs.jproteome.7b00427

Comparison and Evaluation of Clustering Algorithms for Tandem Mass Spectra

Abstract: In proteomics, liquid chromatography-tandem mass spectrometry (LC-MS/MS) is established for identifying peptides and proteins. Duplicated spectra, that is, multiple spectra of the same peptide, occur both in single MS/MS runs and in large spectral libraries. Clustering tandem mass spectra is used to find consensus spectra, with manifold applications. First, it speeds up database searches, as performed for instance by Mascot. Second, it helps to identify novel peptides across species. Third, it is used for qual…

Citations: cited by 10 publications (16 citation statements)
References: 26 publications (63 reference statements)

“…Here, a protocol is presented that hierarchically clusters MS/MS spectra from metabolomics data and outputs fragment ion tables in an easy to interrogate manner. Many studies have explored numerous ways in clustering MS/MS spectra for proteomics analysis and mass spectral dereplication 16,[28][29][30][31] however, clustering MS/MS data for metabolomics has considerations not generally applicable to proteomics data such as distinguishing isobaric ions that are often dereplicated in many of these analysis pipelines 4. The BioDendro protocol does not approach dereplication in the traditional manner of combining MS/MS data of same precursor mass and high similarity spectra regardless of retention time but rather a feature has a single alignment to an MS/MS spectrum within an m/z and retention time tolerance.…”
Section: Discussion (mentioning)
confidence: 99%
“…Next, the pairwise distance matrix is used to cluster the data using the DBSCAN algorithm (figure 1D) [8, 33]. Briefly, if a given number of spectra are close to each other and form a dense data subspace, with closeness defined relative to a user-specified distance threshold, they will be grouped in clusters.…”
Section: Methods (mentioning)
confidence: 99%
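
The density-based grouping described in the statement above can be illustrated with a minimal sketch of DBSCAN on a precomputed pairwise distance matrix. This is not the cited authors' implementation: the function name `dbscan_precomputed`, the parameter names `eps` (the distance threshold) and `min_samples` (the "given number of spectra"), and the toy distances are all assumptions made for illustration.

```python
from collections import deque

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # marker for points not yet processed


def dbscan_precomputed(dist, eps, min_samples):
    """Cluster items from a symmetric pairwise distance matrix.

    dist        -- list of lists, dist[i][j] is the distance between i and j
    eps         -- hypothetical name for the user-specified distance threshold
    min_samples -- hypothetical name for the neighbor count that makes a
                   point "dense" (the count includes the point itself)
    Returns one label per item: 0, 1, ... for clusters, -1 for noise.
    """
    n = len(dist)
    # Neighborhood of i: every point within eps, including i itself.
    neighbors = [[j for j in range(n) if dist[i][j] <= eps] for i in range(n)]
    labels = [UNVISITED] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not UNVISITED:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        # i is a core point: start a new cluster and expand it.
        labels[i] = cluster
        queue = deque(neighbors[i])
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is also core: keep expanding
        cluster += 1
    return labels


# Toy distances for five hypothetical spectra: the first three form a
# chain of close neighbors, the last two are isolated.
toy_dist = [
    [0.0, 0.1, 0.2, 0.9, 5.0],
    [0.1, 0.0, 0.1, 0.8, 4.9],
    [0.2, 0.1, 0.0, 0.7, 4.8],
    [0.9, 0.8, 0.7, 0.0, 4.1],
    [5.0, 4.9, 4.8, 4.1, 0.0],
]
print(dbscan_precomputed(toy_dist, eps=0.15, min_samples=2))
# prints [0, 0, 0, -1, -1]
```

Spectra 0 and 2 are farther apart than the threshold, yet they share a cluster because both are within the threshold of spectrum 1. This chaining through dense neighbors is what distinguishes density-based clustering from a simple pairwise cutoff.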
“…Two recent studies showed considerable differences in the evaluation of spectral clustering algorithms, with regard to accuracy, and compute performance. There are several unresolved challenges in this area.…”
Section: Computational Challenges (mentioning)
confidence: 99%