2015
DOI: 10.1093/bioinformatics/btv340

A note on the false discovery rate of novel peptides in proteogenomics

Abstract: Motivation: Proteogenomics has been well accepted as a tool to discover novel genes. In most conventional proteogenomic studies, a global false discovery rate is used to filter out false positives for identifying credible novel peptides. However, it has been found that the actual level of false positives in novel peptides is often out of control and behaves differently for different genomes. Results: To quantitatively model this problem, we theoretically analyze the subgroup false discovery rates of annotated a…

Cited by 29 publications
(43 citation statements)
References 19 publications
“…To accurately identify novel peptides, it is necessary to filter the search results to control the FDR. Recently, more stringent filtering strategies, such as posterior error probability (89) or separate FDRs for annotated and novel peptides (46,90), were employed in proteomic analysis. In this work, we used a more stringent strategy (novel FDR) to increase the accuracy of identified novel peptides (76,77).…”
Section: Results (mentioning)
confidence: 99%
“…In this work, we used a more stringent strategy (novel FDR) to increase the accuracy of identified novel peptides (76,77). However, because of the lower expression levels or worse scores of the novel peptides in proteogenomic study, the FDR calculation is difficult to explicitly estimate to exclude false peptides (90). To further confirm the identified novel genes, GAPP applied a postprocessing of identification strategy, including the novel protein sequence alignment and GO functional annotation.…”
Section: Results (mentioning)
confidence: 99%
“…Zhang et al. investigated both global and separate FDR settings respectively for sub‐groups of annotated and novel peptides and demonstrated that the annotation completeness ratio of a genome is the dominant factor influencing the subgroup FDR of novel peptides. Another popular method (such as the Percolator) to estimate FDR is by coupling semi‐supervised learning to a decoy database search strategy so as to discriminate correct and false PSMs; often with extra information (e.g., retention time of chromatography, peptide charge state, or mass accuracy) from MS/MS experiments to re‐score PSMs.…”
Section: Current Development In Enabling Technologies (mentioning)
confidence: 99%
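
The difference between a global FDR and a separate (subgroup) FDR mentioned in the statement above can be illustrated with a minimal sketch. It assumes a target-decoy search in which FDR at a score threshold is estimated as (decoys passing) / (targets passing), and that higher scores are better. The `PSM` class, the function names, and all scores are hypothetical toy values invented for illustration; this is not the model analyzed by Zhang et al.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PSM:
    """Hypothetical peptide-spectrum match record (illustration only)."""
    score: float    # assumed: higher means a better match
    is_decoy: bool  # True if matched to a decoy sequence
    is_novel: bool  # True if matched to a non-annotated (novel) sequence


def estimate_fdr(psms: List[PSM], threshold: float) -> float:
    """Simple target-decoy estimate: decoys / targets at or above threshold."""
    targets = sum(1 for p in psms if p.score >= threshold and not p.is_decoy)
    decoys = sum(1 for p in psms if p.score >= threshold and p.is_decoy)
    return decoys / targets if targets else 0.0


def score_threshold(psms: List[PSM], max_fdr: float) -> Optional[float]:
    """Lowest observed score whose estimated FDR is at most max_fdr."""
    best = None
    for t in sorted({p.score for p in psms}, reverse=True):
        if estimate_fdr(psms, t) <= max_fdr:
            best = t
    return best


# Toy data (made up): many annotated targets, few novel ones.
annotated = [PSM(s, False, False)
             for s in (9.0, 8.5, 8.0, 7.5, 7.0, 6.5, 6.0, 5.5, 5.0)]
annotated.append(PSM(4.0, True, False))           # one annotated decoy
novel = [PSM(8.2, False, True), PSM(5.2, False, True),
         PSM(5.1, True, True)]                    # two novel targets, one decoy
all_psms = annotated + novel

global_t = score_threshold(all_psms, max_fdr=0.1)   # one threshold for everything
novel_fdr_at_global = estimate_fdr(novel, global_t)  # subgroup FDR under that threshold
novel_t = score_threshold(novel, max_fdr=0.1)        # threshold from the novel subgroup alone

print(global_t, novel_fdr_at_global, novel_t)
```

In this toy set the pooled estimate stays within the 10% limit because the annotated targets dominate the denominator, while the estimate restricted to the novel subgroup at the same threshold is far higher. Computing the threshold on the novel subgroup alone yields a stricter cutoff for those peptides.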
“…In contrast, proteogenomics studies investigating alternative splicing or noncoding regions require custom made databases containing "novel" sequences predicted either from genome or transcriptome (ESTs, RNASeq, etc.). However, the identification of such novel peptide variants in proteogenomics studies are still susceptive to higher than expected false-discovery rates (FDR) [8,9], mostly due to the larger search space of the customized databases (and consequently higher chance of spurious matching).…”
Section: Introduction (mentioning)
confidence: 99%