Despite a rapid expansion in the number of known RNA viruses following the advent of metagenomic sequencing, the identification and annotation of highly divergent RNA viruses remains challenging, particularly from poorly characterized hosts and environmental samples. Protein structures are more conserved than primary sequence data, such that structure-based comparisons provide an opportunity to reveal the viral “dusk matter”: viral sequences with low, but detectable, levels of sequence identity to known viruses with available protein structures. Here, we present a new open computational resource, RdRp-scan, that contains a standardized bioinformatic toolkit to identify and annotate divergent RNA viruses in metagenomic sequence data based on the detection of RNA-dependent RNA polymerase (RdRp) sequences. By combining RdRp-specific hidden Markov models (HMMs) and structural comparisons, we show that RdRp-scan can efficiently detect RdRp sequences that have identity levels as low as 10% to those from known viruses and are not identifiable using standard sequence-to-sequence comparisons. In addition, to facilitate the annotation and placement of newly detected and divergent virus-like sequences into the known diversity of RNA viruses, RdRp-scan provides new custom and curated databases of viral RdRp sequences and core motifs, as well as pre-built RdRp alignments. In parallel, our analysis of the sequence diversity detected by RdRp-scan revealed that while most of the taxonomically unassigned RdRps fell into pre-established clusters, some sequences cluster into potential new orders of RNA viruses related to the Wolframvirales and Tolivirales. Finally, a survey of the conserved A, B and C RdRp motifs within the RdRp-scan sequence database revealed additional variations of both sequence and position, which might provide new insights into the structure, function and evolution of viral RdRps.
Despite a rapid expansion in the number of documented viruses following the advent of metagenomic sequencing, the identification and annotation of highly divergent RNA viruses remains challenging, particularly from poorly characterized hosts and environmental samples. Protein structures are more conserved than primary sequence data, such that structure-based comparisons provide an opportunity to reveal the viral “dusk matter”: viral sequences with low, but detectable, levels of sequence identity to known viruses with available protein structures. Here, we present a new open computational resource, RdRp-scan, that contains a standardized bioinformatic toolkit to identify and annotate divergent RNA viruses in metagenomic sequence data based on the detection of RNA-dependent RNA polymerase (RdRp) sequences. By combining RdRp-specific hidden Markov models (HMMs) and structural comparisons, we show that RdRp-scan can efficiently detect RdRp sequences that have identity levels as low as 10% to those from known viruses and are not identifiable using standard sequence-to-sequence comparisons. In addition, to facilitate the annotation and placement of newly detected and divergent virus-like sequences into the diversity of RNA viruses, RdRp-scan provides new custom and curated databases of viral RdRp sequences and core motifs, as well as pre-built RdRp multiple sequence alignments. In parallel, our analysis of the sequence diversity detected by RdRp-scan revealed that while most of the taxonomically unassigned RdRps fell into pre-established clusters, some fell into potentially new orders of RNA viruses related to the Wolframvirales and Tolivirales. Finally, a survey of the conserved A, B and C RdRp motifs within the RdRp-scan sequence database revealed additional variations of both sequence and position that might provide new insights into the structure, function and evolution of viral polymerases.
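To make the "identity levels as low as 10%" figure concrete, the sketch below computes percent identity between two pre-aligned protein sequences. It is only an illustration and not how RdRp-scan itself measures identity: the function name `percent_identity`, the toy sequences, and the choice of denominator (columns where both sequences have a residue) are all assumptions made here, and other tools may define identity differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Hypothetical helper for illustration; the denominator convention
    (ungapped columns only) is an assumption, not taken from the source.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # columns where those residues are identical
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if res_a == res_b:
            matches += 1

    return 100.0 * matches / compared if compared else 0.0


# Toy aligned fragments (made up, not real RdRp sequences).
identity = percent_identity("GDD-LVA", "GDDQMV-")
print(f"{identity:.1f}% identity")
```

At identity levels near 10%, a pairwise score like this is hard to distinguish from chance, which is the regime where the abstract reports that profile HMMs and structural comparisons still detect RdRps.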
The RNA virus phylum Lenarviricota comprises the fungi-associated families Narnaviridae and Mitoviridae, the RNA bacteriophage family Leviviridae, and the plant- and fungi-associated family Botourmiaviridae. Members of the Lenarviricota are abundant in most environments and boast remarkable phylogenetic and genomic diversity. As this phylum includes both RNA bacteriophage and fungi- and plant-associated species, the Lenarviricota likely mark a major evolutionary transition between those RNA viruses associated with prokaryotes and eukaryotes. Despite the remarkable expansion of this phylum following metagenomic studies, the phylogenetic relationships among the families within the Lenarviricota remain uncertain. Utilising a large data set of relevant viral sequences, we performed phylogenetic and genomic analyses to resolve the complex evolutionary history within this phylum and identify patterns in the evolution of virus genome organisation. Despite limitations reflecting very high levels of sequence diversity, our phylogenetic analyses suggest that the Leviviridae comprise the basal lineage within the Lenarviricota. Our phylogenetic results also support the construction of a new virus family, the Narliviridae, comprising a set of diverse and phylogenetically distinct species, including a number of uniquely encapsidated viruses. We propose a taxonomic restructuring within the Lenarviricota to better reflect the phylogenetic relationships documented here, with the Botourmiaviridae and Narliviridae combined into the order Ourlivirales, the Narnaviridae remaining in the order Wolframvirales, and these orders combined into the single class, the Amabiliviricetes. In sum, this study provides insights into the complex evolutionary relationships among the diverse families that make up the Lenarviricota.
In April 2023, following the annual International Committee on Taxonomy of Viruses (ICTV) ratification vote on newly proposed taxa, the phylum Negarnaviricota was amended and emended. The phylum was expanded by one new family, 14 new genera, and 140 new species. Two genera and 538 species were renamed. One species was moved, and four were abolished. This article presents the updated taxonomy of Negarnaviricota as now accepted by the ICTV.