2018 | DOI: 10.1021/acs.jproteome.7b00767
A Matter of Time: Faster Percolator Analysis via Efficient SVM Learning for Large-Scale Proteomics

Abstract: Percolator is an important tool for greatly improving the results of a database search and subsequent downstream analysis. Using support vector machines (SVMs), Percolator recalibrates peptide-spectrum matches based on the learned decision boundary between targets and decoys. To improve analysis time for large-scale data sets, we update Percolator's SVM learning engine through software and algorithmic optimizations rather than heuristic approaches that necessitate the careful study of their impact on learned p…

Cited by 9 publications (14 citation statements)
References 20 publications
“…We screened out significant risk genes through feature selection and optimization. An SVM model was trained using ten-fold cross-validation [18]. The SVM model is a supervised classification algorithm of machine learning.…”
Section: Construction of Classification Model by SVM
confidence: 99%
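The protocol in the excerpt above can be sketched in a few lines. This is an illustrative assumption, not the citing authors' code: a tiny Pegasos-style sub-gradient linear SVM trained and scored under ten-fold cross-validation on synthetic two-feature data. All data sizes and hyperparameters are invented for the example.

```python
import random

random.seed(1)
# Synthetic data standing in for the selected "risk gene" features:
# label +1 when the two features sum to a positive value, else -1.
data = [([random.gauss(0, 1), random.gauss(0, 1)], 0) for _ in range(200)]
data = [(x, 1 if x[0] + x[1] > 0 else -1) for x, _ in data]

def train_svm(samples, lam=0.01, epochs=20):
    """Pegasos: stochastic sub-gradient descent on the hinge loss."""
    w, t = [0.0, 0.0], 0
    for _ in range(epochs):
        for x, y in samples:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y * (w[0] * x[0] + w[1] * x[1])
            w = [(1 - eta * lam) * wi for wi in w]   # regularization shrink
            if margin < 1:                            # hinge sub-gradient step
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w

def accuracy(w, samples):
    return sum(1 for x, y in samples
               if y * (w[0] * x[0] + w[1] * x[1]) > 0) / len(samples)

# Ten-fold cross-validation: hold out each tenth of the data in turn.
folds = [data[i::10] for i in range(10)]
scores = []
for i in range(10):
    held_out = folds[i]
    train = [s for j, f in enumerate(folds) if j != i for s in f]
    scores.append(accuracy(train_svm(train), held_out))
```

Each fold's held-out accuracy estimates generalization; the mean over the ten folds is the usual cross-validated score.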
“…The second bottleneck is the execution time required to learn SVM parameters. Recent work 11 has tackled this bottleneck through software optimizations to Percolator's SVM learning engine, and our efforts complement and further improve upon these optimizations. On a massive data set containing over 215 million PSMs, the new version of Percolator achieves an overall speedup of 439% (81.4 h down to 18.6 h).…”
Section: Contributions
confidence: 99%
“…Finally, we optimized the CGLS solver itself using a mixture of low-level linear algebra function calls and software streamlining, as described previously. 11 Optimizations are compared against the recently described CGLS multithreaded speedup, 11 referred to as CGLS-par. In contrast to the second in our series of optimizations, which uses multiple threads to parallelize runs of CGLS at the cross-validation level, CGLS-par instead uses multiple threads to parallelize computation within the CGLS algorithm.…”
Section: Software Optimization
confidence: 99%
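The contrast drawn in the excerpt above, parallelizing at the cross-validation level (one solver run per worker) rather than inside the solver, can be sketched with a thread pool. Here `solve_fold` is a hypothetical stand-in for one CGLS solve, and three folds are assumed purely for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def solve_fold(fold_id):
    # Placeholder for one conjugate-gradient least-squares (CGLS) run on
    # one cross-validation split; dummy arithmetic stands in for the solve.
    return fold_id, sum(i * i for i in range(10_000))

folds = range(3)  # three cross-validation folds assumed for this sketch
with ThreadPoolExecutor(max_workers=len(folds)) as pool:
    # Each worker runs a complete, independent solver instance.
    results = dict(pool.map(solve_fold, folds))
```

The alternative (CGLS-par in the excerpt) would instead keep one solver instance and thread its internal matrix-vector products; the two strategies are complementary, which is why the excerpt describes them as composable.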
“…Recent advances in machine learning tools and widespread use of high throughput techniques provide a massive amount of data as a source to develop tools for every step in MS-based workflows (Bouwmeester et al, 2020). For example, the post-processing tool Percolator (Käll et al, 2007; Halloran and Rocke, 2018) integrates several features into a semi-supervised learning algorithm to improve the distinction between true and false peptide-spectrum matches. Next to that, spectrum intensity predictors, such as MS²PIP (Degroeve et al, 2015; Gabriels et al, 2019) and Prosit (Gessulat et al, 2019) are new models that incorporate fragment ion intensities predictions as additional features next to the standard m/z ratio during spectral library searching to increase the resolution of the identification, even in challenging workflows such as proteogenomics (Verbruggen et al, 2021).…”
Section: Introduction
confidence: 99%
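The semi-supervised target-decoy idea the excerpt attributes to Percolator can be sketched as follows. This is a simplified assumption, not Percolator's actual implementation: iteratively rescore PSMs, take high-scoring targets as provisional positives and decoys as negatives, then refit a linear scorer (a difference-of-means update stands in for the SVM training step):

```python
import random

random.seed(0)
# Synthetic PSMs as (feature_vector, is_decoy); true matches tend to
# score higher on the first feature (a stand-in for the search score).
psms = ([([random.gauss(1.5, 1.0), random.gauss(0, 1)], False) for _ in range(100)]
        + [([random.gauss(0.0, 1.0), random.gauss(0, 1)], True) for _ in range(100)])

w = [1.0, 0.0]  # start by trusting only the search-engine score feature

def score(x, w):
    return sum(wi * xi for wi, xi in zip(w, x))

for _ in range(3):  # a few semi-supervised refinement iterations
    ranked = sorted(psms, key=lambda p: score(p[0], w), reverse=True)
    positives = [x for x, decoy in ranked[:50] if not decoy]  # confident targets
    negatives = [x for x, decoy in psms if decoy]             # all decoys
    mean = lambda vs, j: sum(v[j] for v in vs) / len(vs)
    # Refit w as the difference of class means (SVM stand-in for the sketch).
    w = [mean(positives, j) - mean(negatives, j) for j in range(2)]
```

After a few iterations the learned weights favor the feature that actually separates targets from decoys, which is the recalibration effect the excerpt describes.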