2010
DOI: 10.1016/j.jprot.2010.06.007

Proteogenomics to discover the full coding content of genomes: A computational perspective

Abstract: Proteogenomics has emerged as a field at the junction of genomics and proteomics. It is a loose collection of technologies that allow the search of tandem mass spectra against genomic databases to identify and characterize protein-coding genes. Proteogenomic peptides provide invaluable information for gene annotation, which is difficult or impossible to ascertain using standard annotation methods. Examples include confirmation of translation, reading-frame determination, identification of gene and exon boundar…

Cited by 143 publications (146 citation statements)
References 96 publications
“…These considerations are more intricate in proteogenomic projects that aim at genome annotation and discovery of novel gene models from shotgun proteomic data (58,59). The nature of these projects entails the use of large sequence databases that account for all possible protein coding regions of a genome.…”
Section: Data Set and Database Size Matter
mentioning
confidence: 99%
“…Proteogenomic studies for various model organisms resorted to six frame translated genomic databases and expressed sequence tag (EST) databases to achieve this goal (60-64). The number of peptides in such databases is in the order of billions and further grows by two orders of magnitude if single amino acid mutations are considered, too (58). Several strategies have been pursued to faithfully compress these databases.…”
Section: Data Set and Database Size Matter
mentioning
confidence: 99%
“…Proteogenomics strategy stands out as an important experimental tool to identify the protein coding potential of sequenced or unsequenced genomes of an organism (Castellana et al, 2010; Krug et al, 2011). Proteogenomically identified peptide data can provide invaluable information for gene annotation, which is almost impossible or difficult to predict using nucleotide sequence information alone.…”
Section: Introduction
mentioning
confidence: 99%
“…Over the last few years, computational proteomics has become a dramatically growing field, and a handful of tools have been developed to execute complete proteogenomic analyses (11,46,47). Two excellent reviews have described a comprehensive overview of the various problems commonly encountered and their current solutions for this growing research area (8,11).…”
mentioning
confidence: 99%