2021
DOI: 10.1093/bioinformatics/btab827

No one tool to rule them all: prokaryotic gene prediction tool annotations are highly dependent on the organism of study

Abstract: Motivation: The biases in CoDing Sequence (CDS) prediction tools, which have been based on historic genomic annotations from model organisms, impact our understanding of novel genomes and metagenomes. This hinders the discovery of new genomic information as it results in predictions being biased towards existing knowledge. To date, users have lacked a systematic and replicable approach to identify the strengths and weaknesses of any CoDing Sequence (CDS) prediction tool and allow them to choos…

Cited by 25 publications (66 citation statements)
References 75 publications (95 reference statements)
“…The concept of a StORF shows that despite all the advancements in computational biology, there are still clear problems in how genome annotation is undertaken. Previous findings (8) confirmed that genome annotation methods in use today such as Prodigal (19) accurately detect the majority of CDS genes (in genomes which use the ‘universal’ codon table). However, in general, while all these tools overpredicted the number of genes, there were still large (>=10-20%) portions of the genomes studied without any annotation.…”
Section: Discussion (mentioning)
confidence: 55%
“…The ORForise platform reported that Prodigal was able to identify the vast majority of Ensembl genes from each of the MOs, except for M. genitalium (8). The genomic regions containing no CDS predictions were extracted with UR-Extractor and StORFs were reported from these URs using StORF-Finder.…”
Section: Results (mentioning)
confidence: 99%
“…Functional annotations of predicted genes were performed using eggNOG-mapper v2 [7] with default parameters. While in principle alternative gene predictions can impact the subsequent functional annotation, previous empirical investigations found negligible performance variation among different tools [23], [24]. For this reason, our tests focused on benchmarking functional annotation prediction by using a single state-of-the-art gene prediction tool.…”
Section: Results (mentioning)
confidence: 96%