2014
DOI: 10.1109/tcbb.2013.152

CAMS-RS: Clustering Algorithm for Large-Scale Mass Spectrometry Data Using Restricted Search Space and Intelligent Random Sampling

Abstract: High-throughput mass spectrometers can produce massive amounts of redundant data at an astonishing rate, and many of the resulting spectra have a poor signal-to-noise (S/N) ratio. These low-S/N spectra may not be interpreted by conventional spectra-to-database matching techniques. In this paper, we present an efficient algorithm, CAMS-RS (Clustering Algorithm for Mass Spectra using Restricted Space and Sampling), for clustering raw mass spectrometry data. CAMS-RS utilizes a novel metric (called F-set) that exploits…

Citations: cited by 22 publications (23 citation statements)
References: 32 publications
“…The UPS2 contains a mixture of 48 individual human sequence recombinant proteins, each of which has been selected to limit heterogenous post-translational modifications. The detailed experimental protocol can be seen in [13]. The increase in the confidence of the peptide match is shown in table 2.…”
Section: Results (mentioning)
confidence: 99%
“…The redundancy can reach up to 50% for large data sets [10, 11, 12]. Clustering of spectra from complex biological samples can also increase the sensitivity and confidence in peptide matches [13]. The increase in identifications can be attributed to the fact that clustering allows low S/N spectra to be grouped with high-quality spectra, which in a non-clustered data set would be eliminated from identification.…”
Section: Introduction (mentioning)
confidence: 99%
“…For all the experiments, we made use of the thirteen datasets we used before in [4] and [19]. Naming conventions for all thirteen data sets are same as in [4].…”
Section: Performance Evaluation (mentioning)
confidence: 99%
“…Pre-processing of MS data has been studied under three major categories i.e. clustering [19], noise reduction [9] and quality assessment [5]. Algorithms from all categories have a common goal of assisting in peptide deduction by improving the quality of peptide spectral matches using standard peptide deduction algorithms.…”
Section: Introduction (mentioning)
confidence: 99%
“…For instance, mass spectrometry based proteomics is a problem of interest for precision medicine, cancer research and drug discovery. However, experiments in this domain produce big and complex data sets reaching peta-byte level [1] [3] [24]. Simple protein and metaproteomic library searches can take impractically long compute times [13][12].…”
Section: Introduction (mentioning)
confidence: 99%