2021
DOI: 10.1016/j.patter.2021.100274

RaFAH: Host prediction for viruses of Bacteria and Archaea based on protein content

Abstract: Culture-independent approaches have recently shed light on the genomic diversity of viruses of prokaryotes. One fundamental question when trying to understand their ecological roles is: which host do they infect? To tackle this issue we developed a machine-learning approach named Random Forest Assignment of Hosts (RaFAH), that uses scores to 43,644 protein clusters to assign hosts to complete or fragmented genomes of viruses of Archaea and Bacteria. RaFAH displayed performance comparable with that of other met…

Cited by 62 publications (80 citation statements)
References 62 publications

“…Phage-encoded tRNAs also enable phages to optimize protein synthesis in a host with different codon usage and thus serve as both a marker of increased host range and evolution through different hosts ( 95 ). Indeed, tRNAs are often used for computational host prediction of phages due to high sequence conservation of the gene between host and viral forms ( 96 , 97 ). We postulated that the host range of Melnitz would be evident in the four tRNAs found within its genome and reflect potential hosts in the OM43 clade.…”
Section: Results (mentioning)
confidence: 99%
“…The potential hosts of HMO-2011-MVGs were predicted using RaFAH tool with default settings [ 55 ]. The training and validating random forest model for RaFAH was built with 4269 host-known phages, including 11 HMO-2011-type phages and 4258 bacteriophage genomes downloaded from the NCBI RefSeq (v208).…”
Section: Methods (mentioning)
confidence: 99%
“…We compared our tools with several state-of-the-art tools: WIsH [ 22 ], PHP [ 12 ], HoPhage [ 24 ], VPF-Class [ 21 ], VHM-net [ 14 ], vHULK [ 25 ], and RaFAH [ 23 ]. We also recorded the output of BLASTN to show the performance of the alignment-based tool.…”
Section: Results (mentioning)
confidence: 99%
“…This model integrates CRISPR, score of WIsH, and BLASTN results and applies Random Markov field to generate predictions. A more recently published model, RaFAH [ 23 ] uses alignment features to construct a random forest for host prediction.…”
Section: Introduction (mentioning)
confidence: 99%