2020
DOI: 10.1101/2019.12.31.891218
Preprint

GeneMark-EP and -EP+: eukaryotic gene prediction with self-training in the space of genes and proteins

Abstract: We have made several steps towards creating a fast and accurate algorithm for gene prediction in eukaryotic genomes. First, we introduced an automated method for efficient ab initio gene finding, GeneMark-ES, with parameters trained in iterative unsupervised mode. Next, in GeneMark-ET we proposed a method of integration of unsupervised training with information on intron positions revealed by mapping short RNA reads. Now we describe …

Cited by 46 publications (65 citation statements)
References 29 publications
“…First, the ab initio gene finder GeneMark-ES [1] completes self-training on a given genome and delivers predicted genes, the initial set of seed genes. This step is a part of the internal pipeline ProtHint (described earlier [26]) that executes GeneMark-ES, DIAMOND [36] and Spaln [31] (Fig. 1).…”
Section: Description of BRAKER2 (citation type: mentioning)
confidence: 99%
“…The connection between translated seed genes (seed proteins) and the genomic loci where the seed genes are residing (seed regions) is used in the subsequent steps. The seed proteins make queries for the DIAMOND database search that identifies potentially homologous (target) proteins in a protein database [26]. BRAKER2 runs in two major iterations (Fig.…”
Section: Description of BRAKER2 (citation type: mentioning)
confidence: 99%