A Biologist's View of the Drosophila Genome Annotation Assessment Project

Ashburner, Michael

doi:10.1101/gr.10.4.391

Cited by 50 publications

(29 citation statements)

References 11 publications

“…Guided by gene density estimates from the comprehensive analysis of the Adh region (Ashburner et al 1999), Adams et al made a conservative estimate of 13,601 genes. However, in light of the gene density in the annotated genome sequence, Ashburner conceded that their analysis of the Adh region may have been too conservative, which in turn affects the estimate by Adams et al (Adams et al 2000;Ashburner et al 2000). Our study of transcription in the testis clearly indicates the existence of a significant class of undetected genes in the current genome release.…”

Section: How Many Genes?

mentioning

confidence: 76%

“…The alcohol dehydrogenase region of Drosophila is a case in point. Gene-calling programs failed to identify some known genes in this region (Ashburner et al 1999;Ashburner 2000;Birney and Durbin 2000;Gaasterland et al 2000;Henikoff and Henikoff 2000;Krogh 2000;Parra et al 2000;Reese et al 2000a;Salamov and Solovyev 2000). Expressed sequence tag (EST) analysis is also an important tool for identifying transcription units but is also subject to errors (Adams et al 1991;Okubo et al 1992;Weinstock et al 1994;Adams et al 1995;Hillier et al 1996;Audic and Claverie 1997;Wolfsberg and Landsman 1997;Rubin et al 2000).…”

mentioning

confidence: 99%

Gene Discovery Using Computational and Microarray Analysis of Transcription in the Drosophila melanogaster Testis

Andrews

Bouffard

Cheadle

et al. 2000

Genome Research

Identification and annotation of all the genes in the sequenced Drosophila genome is a work in progress. Wild-type testis function requires many genes and is thus of potentially high value for the identification of transcription units. We therefore undertook a survey of the repertoire of genes expressed in the Drosophila testis by computational and microarray analysis. We generated 3141 high-quality testis expressed sequence tags (ESTs). Testis ESTs computationally collapsed into 1560 cDNA set used for further analysis. Of those, 11% correspond to named genes, and 33% provide biological evidence for a predicted gene. A surprising 47% fail to align with existing ESTs and 16% with predicted genes in the current genome release. EST frequency and microarray expression profiles indicate that the testis mRNA population is highly complex and shows an extended range of transcript abundance. Furthermore, >80% of the genes expressed in the testis showed onefold overexpression relative to ovaries, or gonadectomized flies. Additionally, >3% showed more than threefold overexpression at p <0.05. Surprisingly, 22% of the genes most highly overexpressed in testis match Drosophila genomic sequence, but not predicted genes. These data strongly support the idea that sequencing additional cDNA libraries from defined tissues, such as testis, will be important tools for refined annotation of the Drosophila genome. Additionally, these data suggest that the number of genes in Drosophila will significantly exceed the conservative estimate of 13,601.

“…This is crucial, since this data set is to serve as a basis for most of the postgenomic analysis. In Drosophila, a large-scale experiment has been devised to confront gene prediction programs and experimental approaches (1), based on high-resolution gene mapping in a well-known region of the genome. In Chlamydomonas, the early sequencing of a large stretch of genomic DNA has allowed benchmarking of greenGenie and a preliminary assessment of gene content (24).…”

mentioning

confidence: 99%

Chlamydomonas Immunophilins and Parvulins: Survey and Critical Assessment of Gene Models

Vallon

2005

Eukaryot Cell

“…The identification of all expressed genes and the structure(s) of their transcripts are prerequisites for many structural and functional genomic studies. Gene-finding programs are valuable tools for identifying gene structure, but they are error-prone and suffer from the inability to predict untranslated regions (UTRs) (Ashburner 2000;Reese et al 2000). Direct analysis of gene transcripts is the only proven way to establish gene structures with confidence.…”

mentioning

confidence: 99%

The Drosophila Gene Collection: Identification of Putative Full-Length cDNAs for 70% of D. melanogaster Genes

Stapleton

Liao

Brokstein

et al. 2002

Genome Res.

Collections of full-length nonredundant cDNA clones are critical reagents for functional genomics. The first step toward these resources is the generation and single-pass sequencing of cDNA libraries that contain a high proportion of full-length clones. The first release of the Drosophila Gene Collection Release 1 (DGCr1) was produced from six libraries representing various tissues, developmental stages, and the cultured S2 cell line. Nearly 80,000 random 5′ expressed sequence tags (5′ expressed sequence tags [ESTs]) from these libraries were collapsed into a nonredundant set of 5849 cDNAs, corresponding to ∼40% of the 13,474 predicted genes in Drosophila. To obtain cDNA clones representing the remaining genes, we have generated an additional 157,835 5′ ESTs from two previously existing and three new libraries. One new library is derived from adult testis, a tissue we previously did not exploit for gene discovery; two new cap-trapped normalized libraries are derived from 0-22-h embryos and adult heads. Taking advantage of the annotated D. melanogaster genome sequence, we clustered the ESTs by aligning them to the genome. Clusters that overlap genes not already represented by cDNA clones in the DGCr1 were analyzed further, and putative full-length clones were selected for inclusion in the new DGC. This second release of the DGC (DGCr2) contains 5061 additional clones, extending the collection to 10,910 cDNAs representing >70% of the predicted genes in Drosophila.

[The sequence data described in this paper have been submitted to the GenBank data library under accession nos. BF485518-BF503517, BF503521-BF506780, BG631888-BG631996, BG633696-BG637540, BG640063-BG641469, BI141709-BI142246, BI161485-BI173971, BI212109-BI216987, BI227448-BI233322, BI234009-BI243989, BI351612-BI354228, BI354231-BI355901, BI355935-BI358751, BI361285-BI376197, BI481532-BI487261, BI563331-BI593695, BI604243-BI620155, BI620158-BI635012, BI635064-BI638027, and BI638030-BI642053. The following individuals kindly provided reagents, samples, or unpublished information as indicated in the paper: J. Pringle and M. Fuller.]

The identification of all expressed genes and the structure(s) of their transcripts are prerequisites for many structural and functional genomic studies. Gene-finding programs are valuable tools for identifying gene structure, but they are error-prone and suffer from the inability to predict untranslated regions (UTRs) (Ashburner 2000;Reese et al. 2000). Direct analysis of gene transcripts is the only proven way to establish gene structures with confidence. Generating a collection of expressed sequence tags (ESTs) from high quality cDNA libraries is a widely used approach for acquiring this information (Adams et al. 1991). The sequences of ESTs and full-length nonredundant cDNA collections provide ideal tools for genome annotation and for the further training of gene prediction algorithms. Our first D. melanogaster EST project yielded putative full-length clones corresponding to >5000 different genes (Rubin et al. 2000). This was acco...
