2020
DOI: 10.1101/2020.11.12.380881
Preprint

Deep Semi-Supervised Learning Improves Universal Peptide Identification of Shotgun Proteomics Data

Abstract: In proteomic analysis pipelines, machine learning post-processors play a critical role in improving the accuracy of shotgun proteomics analysis. Most often performed in a semi-supervised manner, such post-processors accept the peptide-spectrum matches (PSMs) and corresponding feature vectors resulting from a database search, train a machine learning classifier, and recalibrate PSM scores based on the resulting trained parameters, often leading to significantly more identified peptides across q-value thresholds…
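The abstract measures improvement as the number of peptides identified across q-value thresholds. As a rough illustration only, the sketch below estimates q-values from target and decoy PSM scores using one common estimator (decoys divided by targets at each score threshold) and counts the targets accepted at a given threshold. The function names `q_values` and `count_accepted_targets`, the toy scores, and the choice of estimator are assumptions made for this example; they are not taken from the preprint.

```python
from typing import List


def q_values(scores: List[float], is_target: List[bool]) -> List[float]:
    """Return one q-value per PSM, in input order (higher score = better).

    Assumption: FDR at a threshold is estimated as #decoys / #targets among
    all PSMs scoring at or above that threshold. Ties are not treated specially.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Walk down the ranked list, estimating FDR at each cut-off.
    fdrs = []
    targets = decoys = 0
    for i in order:
        if is_target[i]:
            targets += 1
        else:
            decoys += 1
        fdrs.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at which this PSM would still be accepted,
    # i.e. the running minimum taken from the bottom of the ranking upward.
    qvals = [0.0] * len(scores)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        qvals[order[rank]] = running_min
    return qvals


def count_accepted_targets(scores: List[float], is_target: List[bool],
                           threshold: float = 0.01) -> int:
    """Count target PSMs whose q-value is at or below the threshold."""
    qvals = q_values(scores, is_target)
    return sum(1 for q, t in zip(qvals, is_target) if t and q <= threshold)


# Toy data (hypothetical): six PSMs, two of them decoys.
labels = [True, True, False, True, True, False]
search_scores = [9.0, 8.0, 7.0, 6.0, 5.0, 4.0]   # raw search-engine scores
rescored = [9.0, 8.0, 1.0, 6.0, 5.0, 0.0]        # after a hypothetical rescoring

print(count_accepted_targets(search_scores, labels))  # 2
print(count_accepted_targets(rescored, labels))       # 4
```

In this toy case the rescoring pushes both decoys below every target, so all four targets pass the 1% threshold instead of two. That is the kind of gain the abstract refers to, shown here on made-up numbers.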

Cited by 4 publications (10 citation statements)
References 44 publications (65 reference statements)
“…Recently, several deep learning methods have been proposed, such as pDeep, Prosit, DeepMass, and ProteoTorchDNN, to improve the number of annotations. Prosit extracts nearly 60 features for every spectrum annotation obtained with the Andromeda/MaxQuant system, including the application of the intensities of the theoretical spectra predicted by deep LSTM networks.…”
Section: Results (citation type: mentioning)
confidence: 99%
“…Recently, several deep learning methods have been proposed, such as pDeep [46], Prosit [23], DeepMass [47], and ProteoTorchDNN [25], to improve the number of annotations. Prosit extracts nearly 60 features for every spectrum annotation obtained with the Andromeda/MaxQuant system [7] … We have performed a grid search over the convolutional kernel number and its width to find the best architecture for Slider using the HumVar data set.…”
Section: Comparison With Prosit (citation type: mentioning)
confidence: 99%
“…For example, PSM rescoring in ANN-SoLo can trivially switch between the standard linear SVM, as used by Percolator [6], or a random forest (Supplementary figure S5), which can learn a non-linear decision function and which has previously been shown to have beneficial performance during PSM rescoring [8,33]. Additionally, other machine learning models, such as neural networks [9], could effortlessly be integrated as well if they support the popular Scikit-Learn application programming interface (API) [55]. Furthermore, being able to directly manipulate the PSM rescoring functionality allowed us to get deeper insights into the workings of the machine learning model, for example, to perform a detailed analysis of the feature importances.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…This makes it possible to improve the spectrum identification results by accepting more target PSMs at a specific FDR threshold. Since the original description of Percolator [6], several approaches that extend upon this idea have been proposed, including tools that use different machine learning models instead of a linear SVM [7][8][9], integration with various search engines [10][11][12], and additional features that the classifier can use, often powered by deep learning [13][14][15][16][17][18][19][20]. Another recent development is the emergence of novel search engines that can efficiently perform so-called open modification searching (OMS).…”
Section: Introduction (citation type: mentioning)
confidence: 99%
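
The citation statements above describe rescoring tools in which the machine learning model can be swapped, for example a linear model for a non-linear one, as long as it follows a shared interface. The sketch below illustrates that design idea in plain Python. The `Rescorer` protocol, its `fit` and `score` methods, and both toy models are hypothetical stand-ins invented for this example. They are not the Scikit-Learn API, and they are not the linear SVM or random forest used by the cited tools.

```python
from typing import List, Protocol, Sequence

Vector = Sequence[float]


class Rescorer(Protocol):
    """Hypothetical interface: any model with these two methods can be plugged in."""

    def fit(self, X: List[Vector], y: List[int]) -> "Rescorer": ...

    def score(self, X: List[Vector]) -> List[float]: ...


class MeanDifferenceRescorer:
    """Toy linear model: project features onto (target mean - decoy mean)."""

    def fit(self, X: List[Vector], y: List[int]) -> "MeanDifferenceRescorer":
        pos = [x for x, label in zip(X, y) if label == 1]
        neg = [x for x, label in zip(X, y) if label == 0]
        self.w = [
            sum(x[j] for x in pos) / len(pos) - sum(x[j] for x in neg) / len(neg)
            for j in range(len(X[0]))
        ]
        return self

    def score(self, X: List[Vector]) -> List[float]:
        return [sum(w * v for w, v in zip(self.w, x)) for x in X]


class NearestNeighborRescorer:
    """Toy non-linear model: distance to nearest decoy minus distance to nearest target."""

    def fit(self, X: List[Vector], y: List[int]) -> "NearestNeighborRescorer":
        self.pos = [x for x, label in zip(X, y) if label == 1]
        self.neg = [x for x, label in zip(X, y) if label == 0]
        return self

    def score(self, X: List[Vector]) -> List[float]:
        def sq_dist(a: Vector, b: Vector) -> float:
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(sq_dist(x, n) for n in self.neg) - min(sq_dist(x, p) for p in self.pos)
            for x in X
        ]


def rescore(model: Rescorer, X: List[Vector], y: List[int]) -> List[float]:
    """Train any conforming model and return new PSM scores.

    Simplification: the model scores the same PSMs it was trained on. A real
    post-processor would typically score held-out PSMs to limit overfitting.
    """
    return model.fit(X, y).score(X)


# Toy feature vectors (hypothetical); label 1 = target PSM, 0 = decoy PSM.
X = [[3.0, 1.0], [2.5, 0.5], [0.5, 2.0], [0.0, 2.5]]
y = [1, 1, 0, 0]

for model in (MeanDifferenceRescorer(), NearestNeighborRescorer()):
    print(type(model).__name__, rescore(model, X, y))
```

Because `rescore` depends only on the two-method interface, changing the model is a one-line change at the call site. This is the property the quoted discussion attributes to tools built around a common estimator API.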