2017
DOI: 10.1186/s12859-017-1529-8

Orthograph: a versatile tool for mapping coding nucleotide sequences to clusters of orthologous genes

Abstract: Background: Orthology characterizes genes of different organisms that arose from a single ancestral gene via speciation, in contrast to paralogy, which is assigned to genes that arose via gene duplication. An accurate orthology assignment is a crucial step for comparative genomic studies. Orthologous genes in two organisms can be identified by applying a so-called reciprocal search strategy, given that complete information of the organisms’ gene repertoire is available. In many investigations, however, only a fr…

Cited by 137 publications (152 citation statements)
References 30 publications

“…Settings were as follows: ‐long option, the conserved genes in the Endopterygota set (n = 2442), and ‐sp tribolium2012 as the closest relative. The predicted species‐specific gene models were then used for ab initio gene predictions in augustus, and predicted protein coding sequences were used in Orthograph v. 0.6.1 (Petersen et al , ). Outgroup data were assembled as described in previous studies (Kusy et al , ,b).…”
Section: Methods (mentioning)
confidence: 99%
“…Each of the 522 genes was then assessed for orthology in Orthograph using a reciprocal blast search, and this was done for each taxon-based result from the Orthograph pipeline. Results were stored in both AA and nucleotide (NT) format for each taxon-based result (following Petersen et al, 2017). The resulting fasta formatted NT files for each species were screened for vector contamination using UniVec (Cochrane & Galperin, 2010).…”
Section: Read Processing Assembly and Orthology Assessment (mentioning)
confidence: 99%
“…As a result, we employed a custom bioinformatics pipeline to process these files. First, headers of all files were modified using orthograph2hamstrad.pl (Perl script provided from the Orthograph package; Petersen et al, 2017). Second, reference genes (OGSs) are removed for each file, such that each of them contained only one target sequence for each gene and a clear taxon name (or taxon code) that includes an OrthoDB7 ID.…”
Section: Phylogenomic Pipeline (mentioning)
confidence: 99%