2020
DOI: 10.1093/gbe/evaa211

Benchmarking Orthogroup Inference Accuracy: Revisiting Orthobench

Abstract: Orthobench is the standard benchmark to assess the accuracy of orthogroup inference methods. It contains 70 expert curated reference orthogroups (RefOGs) that span the Bilateria and cover a range of different challenges for orthogroup inference. Here we leveraged improvements in tree inference algorithms and computational resources to re-interrogate these RefOGs and carry out an extensive phylogenetic delineation of their composition. This phylogenetic revision altered the membership of 31 of the 70 RefOGs, wi…

Cited by 26 publications (24 citation statements)
References 32 publications
“…We have demonstrated that Possvm classifications show very high precision and recall against a notably large multigene family (ANTP homeoboxes) and a curated benchmark of orthology groups (Trachana et al. 2011; Emms and Kelly 2020). Yet, it is crucial to highlight that Possvm’s performance depends on the quality of the input gene tree.…”
Section: Discussion (mentioning)
confidence: 99%
“…This more inclusive metric results in higher recall without a detrimental effect on precision (supplementary material S3, Supplementary Material online). Possvm showed comparably high performance in other data sets, including subsets of insect and vertebrate ANTPs, the PRD and TALE homeobox classes, and 70 manually curated orthogroups from the Orthobench database (Trachana et al. 2011; Emms and Kelly 2020): in all cases, average precision and recall were above 0.90 (fig. 2C and supplementary materials S3 and S4, Supplementary Material online).…”
Section: Benchmarking the Accuracy of the Orthology Clustering (mentioning)
confidence: 96%
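
The precision and recall figures quoted above compare a predicted orthogroup with a reference orthogroup. A minimal sketch of the usual set-based definitions follows. It is not taken from Possvm or Orthobench; the function name and the gene identifiers are hypothetical, and the exact scoring used in either benchmark may differ.

```python
def precision_recall(predicted, reference):
    """Set-based precision and recall of one predicted orthogroup
    against one reference orthogroup (generic definitions, assumed here)."""
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    # Precision: fraction of predicted genes that belong to the reference.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of reference genes that were recovered.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical gene identifiers, for illustration only.
reference_og = {"geneA", "geneB", "geneC", "geneD"}
predicted_og = {"geneA", "geneB", "geneC", "geneX"}

precision, recall = precision_recall(predicted_og, reference_og)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case one reference gene is missed and one unrelated gene is included, so both values fall below 1.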
“…Because e-values (and their constituent bit-scores) are imperfectly correlated with evolutionary relatedness, the set of similar sequences meeting the search threshold will often be missing some genes as well as often including genes that should not be present. A systematic study using HMMER found that for all n genes from an orthogroup clade to pass an e-value threshold, on average the threshold would have to be set such that 1.8n genes in total met the threshold [21]. i.e.…”
Section: Discussion (mentioning)
confidence: 99%
“…i.e. an additional 80% of genes needed to be included, on average, to ensure the orthogroup was complete [21]. Thus, unless a very lenient search is used, genes will be incorrectly absent from the final tree.…”
Section: Discussion (mentioning)
confidence: 99%
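
The arithmetic behind the 1.8n figure in the two statements above can be sketched as follows. This assumes the average factor of 1.8 applies directly to a single orthogroup, which the source states only as an average; the function name and the example size of 10 genes are hypothetical.

```python
def hits_for_complete_orthogroup(n_genes, inflation=1.8):
    """Given an orthogroup of n_genes, estimate how many sequences pass
    the threshold once it is lenient enough to capture all n_genes.
    The factor 1.8 is the average reported in the quoted statement."""
    total_hits = inflation * n_genes
    extra_hits = total_hits - n_genes      # genes outside the orthogroup
    precision = n_genes / total_hits       # share of hits that are true members
    return total_hits, extra_hits, precision


# Hypothetical orthogroup of 10 genes.
total, extra, precision = hits_for_complete_orthogroup(10)
print(f"total={total:.1f} extra={extra:.1f} precision={precision:.3f}")
```

Under this assumption, full recall comes at the cost of roughly 80% additional sequences, so only about 1/1.8 (around 56%) of the sequences passing the threshold belong to the orthogroup.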