2005
DOI: 10.1073/pnas.0409421102

A computational and experimental approach to validating annotations and gene predictions in the Drosophila melanogaster genome

Abstract: Five years after the completion of the sequence of the Drosophila melanogaster genome, the number of protein-coding genes it contains remains a matter of debate; the number of computational gene predictions greatly exceeds the number of validated gene annotations. We have assembled a collection of >10,000 gene predictions that do not overlap existing gene annotations and have developed a process for their validation that allows us to efficiently prioritize and experimentally validate predictions from various s…

Cited by 33 publications (24 citation statements)
References 19 publications
“…To address this possibility, we used transcript predictions from ab initio prediction programs [17,18] in order to eliminate all upstream sequences that contained predicted transcripts (see Materials and Methods). These programs have a relatively liberal definition of genes [19], which allowed us to be more stringent in identifying sequences that do not contain genes in them. Our more stringent set included 2,390 genes.…”
Section: Results; citation type: mentioning
confidence: 99%
“…Gene number is one such comparison. Though protein-coding gene numbers have been a subject of controversy, most annotated model Eukaryotes contain on the order of 15,000-25,000 protein-coding genes (for discussion, see Yandell et al. 2005). Drosophila, e.g., is believed to contain fewer than 15,000 protein-coding genes, and the WS160 WormBase release puts the number of C. elegans genes at slightly less than 20,000.…”
Section: Protein-coding Gene Numbers; citation type: mentioning
confidence: 99%
“…Comparative information has improved computational gene predictors [5], but their accuracy still falls far short of well-studied gene catalogues such as the FlyBase annotation, which combines computational gene prediction [37], high-throughput experimental data [38-42] and extensive manual curation [23]. Recognizing this, we set out not only to produce an independent computational annotation of protein-coding genes in the fly genome, but also to assess and refine its already high-quality annotations [43].…”
Section: Revisiting the Protein-coding Gene Catalogue; citation type: mentioning
confidence: 99%