Background Chromoviruses are one of the three genera of Ty3-gypsy long terminal repeat (LTR) retrotransposons, and are present in high copy numbers in plant genomes. They are widely distributed within the plant kingdom, with representatives even in lower plants such as green and red algae. Their hallmark is the presence of a chromodomain at the C-terminus of the integrase. The chromodomain exhibits structural characteristics similar to proteins of the heterochromatin protein 1 (HP1) family, which mediate the binding of each chromovirus type to specific histone variants. A specific integration via the chromodomain has been shown for only a few chromoviruses. However, a detailed study of different chromoviral clades populating a single plant genome has not yet been carried out.
Results We conducted a comprehensive survey of chromoviruses within the Beta vulgaris (sugar beet) genome, and found a highly diverse chromovirus population, with significant differences in element size, primarily caused by their flanking LTRs. In total, we identified and annotated full-length members of 16 families belonging to the four plant chromoviral clades: CRM, Tekay, Reina, and Galadriel. The families within each clade are structurally highly conserved; in particular, the position of the chromodomain coding region relative to the polypurine tract is clade-specific. Two distinct groups of chromodomains were identified. The group II chromodomain was present in three chromoviral clades, whereas families of the CRM clade contained a more divergent motif. Physical mapping using representatives of all four clades identified a clade-specific integration pattern. For some chromoviral families, we detected the presence of expressed sequence tags, indicating transcriptional activity.
Conclusions We present a detailed study of chromoviruses, belonging to the four major clades, which populate a single plant genome. Our results illustrate the diversity and family structure of B. vulgaris chromoviruses, and emphasize the role of chromodomains in the targeted integration of these viruses. We suggest that the diverse sets of plant chromoviruses with their different localization patterns might help to facilitate plant-genome organization in a structural and functional manner.
SUMMARY A large fraction of eukaryotic genomes is made up of long interspersed nuclear elements (LINEs). Due to their capability to create novel copies via error-prone reverse transcription, they generate multiple families and reach high copy numbers. Although mammalian LINEs have been well described, plant LINEs have been only poorly investigated. Here, we present a systematic cross-species survey of LINEs in higher plant genomes, shedding light on plant LINE evolution and diversity, and facilitating their annotation in genome projects. Applying a Hidden Markov Model (HMM)-based analysis, 59 390 intact LINE reverse transcriptases (RTs) were extracted from 23 plant genomes. These fall into only two of the 28 LINE clades (L1 and RTE) known in eukaryotes. While plant RTE LINEs are highly homogeneous and mostly constitute only a single family per genome, plant L1 LINEs are extremely diverse and form numerous families. Despite their heterogeneity, all members across the 23 species fall into only seven L1 subclades, some of them defined here. Focusing, as an example, on the L1 LINEs of a basal reference plant genome (Beta vulgaris), we show that the subclade classification level not only reflects RT sequence similarity, but also mirrors structural aspects of complete LINE retrotransposons, such as element size and the position and type of encoded enzymatic domains. Our comprehensive catalogue of plant LINE RTs serves the classification of highly diverse plant LINEs, while the provided subclade-specific HMMs facilitate their annotation.
Supplementary data are available at Bioinformatics online.
Background Extrachromosomal circular DNAs (eccDNAs) are ring-like DNA structures that are physically separated from the chromosomes and range from 100 bp to several megabase pairs in size. Apart from carrying tandemly repeated DNA, eccDNAs may also harbor extra copies of genes or recently activated transposable elements. As eccDNAs occur in all eukaryotes investigated so far and likely play roles in stress, cancer, and aging, they have been prime targets in recent research, although their investigation has been limited by the scarcity of computational tools. Results Here, we present the ECCsplorer, a bioinformatics pipeline to detect eccDNAs in any kind of organism or tissue using next-generation sequencing techniques. Following Illumina-sequencing of amplified circular DNA (circSeq), the ECCsplorer enables an easy and automated discovery of eccDNA candidates. The data analysis encompasses two major procedures: first, read mapping to the reference genome allows the detection of informative read distributions including high coverage, discordant mapping, and split reads. Second, reference-free comparison of read clusters from amplified eccDNA against control sample data reveals specifically enriched DNA circles. Both software parts can be run separately or jointly, depending on the individual aim or data availability. To illustrate the wide applicability of our approach, we analyzed semi-artificial and published circSeq data from the model organisms Homo sapiens and Arabidopsis thaliana, and generated circSeq reads from the non-model crop plant Beta vulgaris. We clearly identified eccDNA candidates from all datasets, with and without reference genomes. The ECCsplorer pipeline specifically detected mitochondrial mini-circles and retrotransposon activation, showcasing the ECCsplorer’s sensitivity and specificity.
Conclusion The ECCsplorer (available online at https://github.com/crimBubble/ECCsplorer) is a bioinformatics pipeline to detect eccDNAs in any kind of organism or tissue using next-generation sequencing data. The derived eccDNA targets are valuable for a wide range of downstream investigations, from the analysis of cancer-related eccDNAs through organelle genomics to the identification of active transposable elements.
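The "high coverage" signal mentioned in the Results above can be illustrated with a small, self-contained sketch. It is not the ECCsplorer's implementation: the function names, the fixed window size, the pseudocount, and the enrichment threshold are all assumptions chosen for illustration. It simply counts read start positions per window in a circSeq sample and a control sample, normalises by library size, and reports windows whose coverage ratio exceeds a threshold.

```python
from collections import Counter


def window_counts(read_starts, window=1000):
    """Count reads per fixed-size window from 0-based read start positions."""
    return Counter(pos // window for pos in read_starts)


def enriched_windows(circ_starts, ctrl_starts, window=1000,
                     min_ratio=5.0, pseudocount=1.0):
    """Return (start, end, ratio) for windows enriched in the circSeq sample.

    All parameters are hypothetical defaults, not values from the ECCsplorer.
    Counts are divided by the total read number of each sample so that
    libraries of different depth can be compared; the pseudocount avoids
    division by zero for windows without control reads.
    """
    circ = window_counts(circ_starts, window)
    ctrl = window_counts(ctrl_starts, window)
    n_circ = max(len(circ_starts), 1)
    n_ctrl = max(len(ctrl_starts), 1)

    hits = []
    for w in sorted(circ):
        circ_frac = (circ[w] + pseudocount) / n_circ
        ctrl_frac = (ctrl.get(w, 0) + pseudocount) / n_ctrl
        ratio = circ_frac / ctrl_frac
        if ratio >= min_ratio:
            hits.append((w * window, (w + 1) * window, ratio))
    return hits


# Toy data: uniform background of 10 reads per window in both samples,
# plus 190 extra reads in the window 2000-3000 of the circSeq sample.
control = [w * 1000 + 5 for w in range(10) for _ in range(10)]
circseq = control + [2500] * 190
candidates = enriched_windows(circseq, control)
```

In practice, coverage alone would be combined with the other signals named in the abstract (discordant read pairs and split reads) before a region is called an eccDNA candidate.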
Although they show high sequence similarity and are evolutionarily closely related, bacterial poly(A) polymerases (PAP) and CCA-adding enzymes catalyze quite different reactions: PAP adds poly(A) tails to RNA 3′-ends, while CCA-adding enzymes synthesize the sequence CCA at the 3′-terminus of tRNAs. Here, two highly conserved structural elements of the corresponding Escherichia coli enzymes were characterized. The first element is a set of amino acids that was identified in CCA-adding enzymes as a template region determining the enzymes' specificity for CTP and ATP. The same element is also present in PAP, where it confers ATP specificity. The second investigated region corresponds to a flexible loop in CCA-adding enzymes and is involved in the incorporation of the terminal A-residue. Although PAP seems to carry a similar flexible region, the functional relevance of this element in PAP is not known. The presented results show that the template region has an essential function in both enzymes, while the second element is surprisingly dispensable in PAP. The data support the idea that the bacterial PAP descends from CCA-adding enzymes, still carries some of the structural elements required for CCA-addition as an evolutionary relic, and is now fixed in a conformation specific for A-addition.
Long terminal repeat (LTR) retrotransposons are major components of plant genomes influencing genome size and evolution. Using two separate approaches, we identified the Ty1-copia retrotransposon families Cotzilla and SALIRE in the Beta vulgaris (sugar beet) genome. While SALIRE elements are similar to typical Ty1-copia retrotransposons, Cotzilla elements belong to a lineage called Sireviruses. Hallmarks of Cotzilla retrotransposons are the existence of an additional putative env-like open reading frame upstream of the 3′ LTR, an extended gag region, and a frameshift separating the gag and pol genes. Detected in a c0t-1 DNA library, Cotzilla elements belong to the most abundant retrotransposon families in B. vulgaris and are relatively homogeneous and evolutionarily young. In contrast, the SALIRE family has relatively few copies, is diverged, and is most likely ancient. As revealed by fluorescent in situ hybridization, SALIRE elements target predominantly gene-rich euchromatic regions, while Cotzilla retrotransposons are abundant in the intercalary and pericentromeric heterochromatin. The analysis of two retrotransposons from the same subclass that contrast in abundance, age, sequence diversity, and localization gives insight into the heterogeneity of LTR retrotransposons populating a plant genome.
Summary If two related plant species hybridize, their genomes may be combined and duplicated within a single nucleus, thereby forming an allotetraploid. How the emerging plant balances two co‐evolved genomes is still a matter of ongoing research. Here, we focus on satellite DNA (satDNA), the fastest turn‐over sequence class in eukaryotes, aiming to trace its emergence, amplification, and loss during plant speciation and allopolyploidization. As a model, we used Chenopodium quinoa Willd. (quinoa), an allopolyploid crop with 2n = 4x = 36 chromosomes. Quinoa originated by hybridization of an unknown female American Chenopodium diploid (AA genome) with an unknown male Old World diploid species (BB genome), dating back 3.3–6.3 million years. Applying short read clustering to quinoa (AABB), C. pallidicaule (AA), and C. suecicum (BB) whole genome shotgun sequences, we classified their repetitive fractions, and identified and characterized seven satDNA families, together with the 5S rDNA model repeat. We show unequal satDNA amplification (two families) and exclusive occurrence (four families) in the AA and BB diploids by read mapping as well as Southern, genomic, and fluorescent in situ hybridization. Whereas the satDNA distributions support C. suecicum as possible parental species, we were able to exclude C. pallidicaule as progenitor due to unique repeat profiles. Using quinoa long reads and scaffolds, we detected only limited evidence of intergenomic homogenization of satDNA after allopolyploidization, but were able to exclude dispersal of 5S rRNA genes between subgenomes. Our results exemplify the complex route of tandem repeat evolution through Chenopodium speciation and allopolyploidization, and may provide sequence targets for the identification of quinoa's progenitors.
Summary LTR retrotransposons and retroviruses are closely related. Although a viral envelope gene is found in some LTR retrotransposons and all retroviruses, only the latter show infectivity. The identification of Ty3‐gypsy‐like retrotransposons possessing putative envelope‐like open reading frames blurred the taxonomical borders and led to the establishment of the Errantivirus, Metavirus and Chromovirus genera within the Metaviridae. Only a few plant Errantiviruses have been described, and their evolutionary history is not well understood. In this study, we investigated 27 retroelements of four abundant Elbe retrotransposon families belonging to the Errantiviruses in Beta vulgaris (sugar beet). Retroelements of the Elbe lineage integrated between 0.02 and 5.59 million years ago, and show family‐specific variations in autonomy and degree of rearrangements: while Elbe3 members are highly fragmented, often truncated and present in a high number of solo LTRs, Elbe2 members are mainly autonomous. We observed extensive reshuffling of structural motifs across families, leading to the formation of new retrotransposon families. Elbe retrotransposons harbor a typical envelope‐like gene, often encoding transmembrane domains. During the course of Elbe evolution, the additional open reading frames have been strongly modified or independently acquired. Taken together, the Elbe lineage serves as retrotransposon model reflecting the various stages in Errantivirus evolution, and allows a detailed analysis of retrotransposon family formation.
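The abstract above reports integration times but does not state how they were estimated. A commonly used approach for LTR retrotransposons, shown here only as a hedged sketch and not as the authors' confirmed method, exploits the fact that both LTRs of an element are identical at insertion: the age is approximated as T = K / (2r), where K is the divergence between the two LTRs and r is a substitution rate per site per year. The function names, the Jukes-Cantor correction, and the example rate of 1.3e-8 are assumptions for illustration.

```python
import math


def jukes_cantor_distance(ltr5, ltr3):
    """Jukes-Cantor distance between two aligned LTR sequences.

    The sequences must be pre-aligned and of equal length; alignment
    columns containing a gap ('-') in either sequence are ignored.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(ltr5.upper(), ltr3.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    p = sum(a != b for a, b in pairs) / len(pairs)  # observed mismatch fraction
    if p >= 0.75:
        raise ValueError("sequences too divergent for Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def insertion_age_years(k, rate_per_site_per_year):
    """Approximate insertion age as T = K / (2r).

    The factor 2 accounts for both LTRs accumulating substitutions
    independently since insertion.
    """
    return k / (2.0 * rate_per_site_per_year)


# Toy example: two 100-bp aligned LTRs differing at two positions.
ltr_a = "ACGT" * 25
ltr_b = "TCGT" + "ACGT" * 23 + "ACGA"
k = jukes_cantor_distance(ltr_a, ltr_b)
# The rate below is a placeholder assumption, not a value from the abstract.
age = insertion_age_years(k, 1.3e-8)
```

The resulting age scales inversely with the chosen rate, so estimates of this kind are only as reliable as the substitution rate assumed for the species in question.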