2013
DOI: 10.1021/pr4001114

Predicting Tryptic Cleavage from Proteomics Data Using Decision Tree Ensembles

Abstract: Trypsin is the workhorse protease in mass spectrometry-based proteomics experiments and is used to digest proteins into more readily analyzable peptides. To identify these peptides after mass spectrometric analysis, the actual digestion has to be mimicked as faithfully as possible in silico. In this paper we introduce CP-DT (Cleavage Prediction with Decision Trees), an algorithm based on a decision tree ensemble that was learned on publicly available peptide identification data from the PRIDE repository. We de…

Cited by 44 publications (75 citation statements)
References 32 publications
“…These authors identified amino acids flanking cleavage sites and strongly influencing digestion efficiency. These data were further confirmed by subsequent studies using different datasets 28,35 .…”
supporting
confidence: 70%
“…Software predicting cleavage probabilities exists for many proteases [86], with as usage mode the theoretical digestion of a single protein or mixture. For trypsin, for example, Cleaving prediction with decision trees (CP-DT) uses the positional information of amino acid sequences around the tryptic site to estimate whether or not the protein will be cleaved [88]. By ranking the peptides by the probability that they will occur after tryptic proteolysis, a list of peptides which can be potentially detected is generated.…”
Section: Protease Activity
mentioning
confidence: 99%
“…The way each step in such pipeline transforms the sample depends on the characteristics on the instrument and its parameter settings. Recently, there is a growing interest in modeling more precisely each of these transformations (see [5] for an illustration on the digest step). The more accurate are such models, the more accurately one can reason about what was in the sample at the beginning based on the output of the detection step at the end of the pipeline.…”
Section: A Case Study In Experimental
mentioning
confidence: 99%