2007
DOI: 10.1021/pr7006818

Statistical Validation of Peptide Identifications in Large-Scale Proteomics Using the Target-Decoy Database Search Strategy and Flexible Mixture Modeling

Abstract: Reliable statistical validation of peptide and protein identifications is a top priority in large-scale mass spectrometry based proteomics. PeptideProphet is one of the computational tools commonly used for assessing the statistical confidence in peptide assignments to tandem mass spectra obtained using database search programs such as SEQUEST, MASCOT, or X! TANDEM. We present two flexible methods, the variable component mixture model and the semiparametric mixture model, that remove the restrictive parametric…

Cited by 112 publications (163 citation statements)
References 26 publications
“…The whole dataset of the peptide identifications in practice is usually sufficiently large that FDR(x) can always be accurately estimated by using the target-decoy strategy or more complicated approaches (24,25). However, the number of identifications of k-modified peptides could be too small to support a separate, accurate estimation of FDR_k(x).…”
Section: And Methods (mentioning)
confidence: 99%
“…This is a different and less powerful classification rule than PeptideProphet would ordinarily compute, which uses densities and results in spectra specific error rates, but we use it here in order to compare this with the decoy database threshold-based approaches. See equation (7) on page 11 of Choi and Nesvizhskii [2] for a more general calculation. When searched against a decoy database, this PeptideProphet FIR may be estimated empirically by (# decoy > x)/(# target > x), which is equal to the classic target FIR described above.…”
Section: Perspective (mentioning)
confidence: 99%
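
The empirical estimate quoted above, (# decoy > x)/(# target > x), can be illustrated with a minimal Python sketch. The function name `empirical_fir`, the plain score lists, and the choice to return NaN when no target exceeds the threshold are assumptions made for illustration; they are not taken from the cited papers or from PeptideProphet.

```python
import math
from typing import Sequence


def empirical_fir(target_scores: Sequence[float],
                  decoy_scores: Sequence[float],
                  x: float) -> float:
    """Estimate the false identification rate at score threshold x.

    Computes (# decoy scores > x) / (# target scores > x), the ratio
    described in the quoted statement. Returns NaN when no target score
    exceeds x (an illustrative convention, not from the source).
    """
    n_target = sum(1 for s in target_scores if s > x)
    n_decoy = sum(1 for s in decoy_scores if s > x)
    if n_target == 0:
        return math.nan
    return n_decoy / n_target


# Hypothetical scores, for illustration only.
targets = [0.9, 0.8, 0.7, 0.4, 0.2]
decoys = [0.75, 0.3, 0.1]
fir_at_half = empirical_fir(targets, decoys, 0.5)  # 1 decoy / 3 targets
```

The ratio presumes that the decoy search yields about as many false matches above x as the target search does, which is the usual working assumption of the target-decoy strategy.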
“…The search included fixed modification of carbamidomethylation on cysteine, and variable modifications: oxidation (Met), Gln → pyro-Glu (N-terminal Gln), Glu → pyro-Glu (N-terminal Glu), and deamidation (O16) on both asparagine and glutamine, and O18-incorporated deamidation on asparagine. The FDR analysis used the R implementation of a flexible mixture model as described by Choi et al to calculate global and local peptide-level false discovery rates (23). The false discovery rate (FDR) was controlled at 1% at the peptide level by searching the same data set against a decoy database, and a minimum of two unique peptides per protein was required for the identification of proteins.…”
Section: Methods (mentioning)
confidence: 99%
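
One simple way to control the FDR at a fixed level such as 1% with a decoy search is to pick the lowest score threshold at which the decoy-to-target ratio stays within that level. The Python sketch below uses plain decoy counting, not the flexible mixture model or its R implementation mentioned in the statement; the function name `lowest_threshold_at_fdr`, the use of `>=`, and the example scores are assumptions for illustration.

```python
from typing import Optional, Sequence


def lowest_threshold_at_fdr(target_scores: Sequence[float],
                            decoy_scores: Sequence[float],
                            alpha: float = 0.01) -> Optional[float]:
    """Return the lowest observed target score t such that
    (# decoy >= t) / (# target >= t) <= alpha, or None if no such t exists.

    Candidate thresholds are the observed target scores, so the
    denominator is always at least 1. The scan is quadratic in the
    number of scores, which is acceptable for a small illustration.
    """
    for t in sorted(set(target_scores)):
        n_target = sum(1 for s in target_scores if s >= t)
        n_decoy = sum(1 for s in decoy_scores if s >= t)
        if n_decoy / n_target <= alpha:
            return t
    return None


# Hypothetical scores, for illustration only.
targets = [float(i) for i in range(1, 11)]   # 1.0 .. 10.0
decoys = [1.5, 2.5, 3.5]
threshold = lowest_threshold_at_fdr(targets, decoys, alpha=0.2)
```

Because the estimated ratio is not guaranteed to decrease monotonically with the threshold, the sketch scans thresholds in ascending order and stops at the first one that satisfies the bound.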