2003
DOI: 10.1186/1471-2105-4-21

EasyGene – a prokaryotic gene finder that ranks ORFs by statistical significance

Abstract: Background: Contrary to other areas of sequence analysis, a measure of statistical significance of a putative gene has not been devised to help in discriminating real genes from the masses of random Open Reading Frames (ORFs) in prokaryotic genomes. Therefore, many genomes have too many short ORFs annotated as genes.

Cited by 137 publications (64 citation statements)
References 33 publications (38 reference statements)
“…For prokaryotes, algorithms that focused almost entirely on the prediction stage of the gene identification problem (i.e., how to find genes given a training sequence) evolved into automatic self-training systems capable of adjusting to genome-specific properties in the process of estimating algorithm parameters from anonymous sequence (Audic and Claverie 1998;Hayes and Borodovsky 1998;Salzberg et al 1998;Besemer et al 2001;Larsen and Krogh 2003). A recent dramatic increase in the number of eukaryotic targets of genome sequencing has necessitated the development of unsupervised ab initio gene prediction algorithms for eukaryotes.…”
Section: Discussion
mentioning
confidence: 99%
“…We demonstrated that unsupervised model training, well known in prokaryotic gene finding (Audic and Claverie 1998;Hayes and Borodovsky 1998;Salzberg et al 1998;Besemer et al 2001;Larsen and Krogh 2003;Delcher et al 2007), is also feasible for eukaryotes. Particularly, for genomes of Arabidopsis thaliana, Drosophila melanogaster, and Caenorhabditis elegans, the accuracy of the gene finder with unsupervised parameter estimation matched the accuracy of a conventional supervised gene finder.…”
mentioning
confidence: 96%
“…The coding percentage for both strains was ~86.8%; UJ308A and UJ816A contained approximately 4,720 and 4,710 protein coding sequences with average lengths of 869 and 873 bp, respectively. The data were further validated by Glimmer (7) and EasyGene (12). RNAmmer (11) revealed that the genome of UJ308A has 78 tRNA and 21 rRNA genes and the genome of UJ816A contains 77 tRNA and 22 rRNA genes.…”
mentioning
confidence: 99%
“…Coding sequences (CDSs) were determined by using the EasyGene (20) and Prodigal (21) programs, followed by a comparison of their outputs and manual resolution of their discrepancies based on the presence of potential ribosomal binding sites, similarity searches, multiple-sequence alignments, and data reported previously. Gene functions were assigned based on a comparison of the outputs of two annotation tools: RAST (22) and PANNZER (23).…”
Section: Methods
mentioning
confidence: 99%