2008
DOI: 10.1073/pnas.0811066106

Discovery and revision of Arabidopsis genes by proteogenomics

Abstract: Gene annotation underpins genome science. Most often protein coding sequence is inferred from the genome based on transcript evidence and computational predictions. While generally correct, gene models suffer from errors in reading frame, exon border definition, and exon identification. To ascertain the error rate of Arabidopsis thaliana gene models, we isolated proteins from a sample of Arabidopsis tissues and determined the amino acid sequences of 144,079 distinct peptides by tandem mass spectrometry. The pe…

Citations: cited by 258 publications (289 citation statements)
References: 17 publications
“…We also found that 37 annotated pseudogenes are actually expressed and translated (Table 1 and Dataset S1H). Mining publicly available proteomics data, four ORFs that we identified in either annotated ncRNAs or pseudogenes also have unique peptides detected by mass spectrometry (Dataset S1I) (45). The Western blots and the mass spectrometry data not only support the translation of these unannotated ORFs but also demonstrate that some of the ORFs produce stable proteins in plants.…”
Section: Enhancement Of 3-nt Periodicity Improves Identification Of T (mentioning)
confidence: 65%
“…Sample preparation and MS are based on previously described methods (48)(49)(50) and are detailed in SI Materials and Methods. Briefly, the generated spectra were searched using the B73 RefGen_v2 5a WGS (14).…”
Section: Methods (mentioning)
confidence: 99%
“…A number of pipelines have successfully exploited high-throughput tandem mass-spectrometry (MS/MS) data to predict novel proteins in H. sapiens (Fermin et al 2006; Tanner et al 2007; Bitton et al 2010), Caenorhabditis elegans (Schrimpf et al 2009), Drosophila melanogaster (Schrimpf et al 2009), Arabidopsis thaliana (Castellana et al 2008), and bacteria (Gupta et al 2007). We have therefore applied a pipeline that integrates this proteogenomics approach (Bitton et al 2010) with comparative genomics and genomewide domain prediction to augment the current annotation of S. pombe.…”
Section: The Fission Yeast Schizosaccharomyces Pombe Is a Widely (mentioning)
confidence: 99%