Endogenous viral sequences are essentially ‘fossil records’ that can sometimes reveal the genomic features of long-extinct virus species. Although there are numerous known instances of single-stranded DNA (ssDNA) genomes becoming stably integrated within the genomes of bacteria and animals, there remain very few examples of such integration events in plants. The best studied of these events are those that yielded the geminivirus-related DNA elements found within the nuclear genomes of various Nicotiana species. Although other ssDNA virus-like sequences are included within the draft genomes of various plant species, it cannot be ruled out that these are contaminants. The Nicotiana geminivirus-related DNA elements therefore remain the only definitively proven instances of endogenous plant ssDNA virus sequences. Here, we characterize two new classes of endogenous plant virus sequence that are also apparently derived from ancient geminiviruses in the genus Begomovirus. These two endogenous geminivirus-like elements (EGV1 and EGV2) are present in the Dioscorea spp. of the Enantiophyllum clade. We used fluorescence in situ hybridization to confirm that the EGV1 sequences are integrated in the D. alata genome and showed that one or two ancestral EGV sequences likely became integrated more than 1.4 million years ago, during or before the diversification of the Asian and African Enantiophyllum Dioscorea spp. Unexpectedly, we found evidence of natural selection actively favouring the maintenance of EGV-expressed replication-associated protein (Rep) amino acid sequences, which indicates that functional EGV Rep proteins were probably expressed for prolonged periods following endogenization. Further, the detection in D. alata of EGV gene transcripts, small 21–24 nt RNAs that are apparently derived from these transcripts, and expressed Rep proteins provides evidence that some EGV genes are possibly still functionally expressed in at least some of the Enantiophyllum clade species.