Bacterial retrons consist of a reverse transcriptase (RT) and a contiguous non-coding RNA (ncRNA) gene. One third of annotated retrons carry additional open reading frames (ORFs), whose contribution to and significance in retron biology remain to be determined. In this study we developed a computational pipeline for the systematic prediction of genes specifically associated with retron RTs based on a previously reported large dataset representative of the diversity of prokaryotic RTs. We found that retrons generally comprise a tripartite system composed of the ncRNA, the RT and an additional protein or RT-fused domain with diverse enzymatic functions. These retron systems are highly modular, and their components have coevolved to different extents. Based on the additional module, we classified retrons into 13 types, some of which include additional variants. Our findings provide a basis for future studies on the biological function of retrons and for expanding their biotechnological applications.
Most bacteria and archaea possess multiple antiviral defence systems that protect against infection by phages, archaeal viruses and mobile genetic elements. Our understanding of the diversity of defence systems has increased greatly in the last few years, and many more systems likely await discovery. To identify defence-related genes, we recently developed the Prokaryotic Antiviral Defence LOCator (PADLOC) bioinformatics tool. To increase the accessibility of PADLOC, we describe here the PADLOC web server (freely available at https://padloc.otago.ac.nz), allowing users to analyse whole genomes, metagenomic contigs, plasmids, phages and archaeal viruses. The web server includes a more than 5-fold increase in defence system types detected (since the first release) and expanded functionality enabling detection of CRISPR arrays and retron ncRNAs. Here, we provide user information such as input options, description of the multiple outputs, limitations and considerations for interpretation of the results, and guidance for subsequent analyses. The PADLOC web server also houses a precomputed database of the defence systems in >230,000 RefSeq genomes. These data reveal two taxa, Campylobacterota and Spirochaetota, with unusual defence system diversity and abundance. Overall, the PADLOC web server provides a convenient and accessible resource for the detection of antiviral defence systems.
Prokaryotic genomes harbour a plethora of uncharacterized reverse transcriptases (RTs). RTs phylogenetically related to those encoded by group-II introns have been found associated with type III CRISPR-Cas systems, adjacent or fused at the C-terminus to Cas1. It is thought that these RTs may have a relevant function in the CRISPR immune response, mediating spacer acquisition from RNA molecules. The origin and relationships of these RTs and the ways in which the various protein domains evolved remain matters of debate. We carried out a large survey of annotated RTs in databases (198,760 sequences) and constructed a large dataset of unique representative sequences (9,141). The combined phylogenetic reconstruction and identification of the RTs and their various protein domains in the vicinity of CRISPR adaptation and effector modules revealed three different origins for these RTs, consistent with their emergence on multiple occasions: a larger group that has evolved from group-II intron RTs, and two minor lineages that may have arisen more recently from retron/retron-like sequences and Abi-P2 RTs, the latter associated with type I-C systems. We also identified a particular group of RTs associated with CRISPR-cas loci in clade 12, fused C-terminally to an archaeo-eukaryotic primase (AEP), a protein domain (AE-Prim_S_like) forming a particular family within the AEP proper clade. Together, these data provide new insight into the evolution of CRISPR-Cas/RT systems.