Mitochondrial intron patterns are highly divergent between the major land plant clades. An intron in the atp1 gene, atp1i361g2, is an example of a group II intron specific to monilophytes (ferns). Here, we report that atp1i361g2 has been lost independently at least four times in the fern family Pteridaceae. Such plant organelle intron losses have previously been found to be accompanied by loss of RNA editing sites in the flanking exon regions as a consequence of genomic recombination of mature cDNA. In contrast, we now observe that RNA editing events in both directions of pyrimidine exchange (C-to-U and U-to-C) are retained in atp1 exons after loss of the intron in Pteris argyraea/biaurita and in Actiniopteris and Onychium. We find that atp1i361g2 has significant similarity with intron rps3i249g2 present in lycophytes and gymnosperms, which we now also find highly conserved in ferns. We conclude that atp1i361g2 may have originated from the more ancestral rps3i249g2 paralogue by a reverse splicing copy event early in the evolution of monilophytes. Secondary structure elements of the two introns, most characteristically their domains III, show strikingly convergent evolution in the monilophytes. Moreover, the intron paralogue rps3i249g2 reveals relaxed evolution in taxa where the atp1i361g2 paralogue is lost. Our findings may reflect convergent evolution of the two related mitochondrial introns driven by co-evolution with an intron-binding protein simultaneously acting on the two paralogues.
The occurrence of group II introns in plant mitochondrial genomes is strikingly different between the six major land plant clades, in contrast to their highly conserved counterparts in chloroplast DNA. Their present distribution likely reflects numerous ancient intron gains and losses during early plant evolution before the emergence of seed plants. As a novelty for plant organelles, we here report on five cases of twintrons, introns-within-introns, in the mitogenomes of lycophytes and hornworts. An internal group II intron interrupts an intron-borne maturase of an atp9 intron in Lycopodiaceae, and its splicing precedes splicing of the external intron. An invasive, hypermobile group II intron in cox1 has conquered nine further locations including a previously overlooked sdh3 intron and, most surprisingly, also itself. In those cases, splicing of the external introns does not depend on splicing of the internal introns. Similar cases are identified in the mtDNAs of hornworts. Although the internal group II intron disrupts a group I intron-encoded protein in one case, we could not detect its splicing in this ‘mixed’ group I/II twintron. We suggest the name ‘zombie’ twintrons (half-dead, half-alive) for such cases where splicing of external introns no longer depends on prior splicing of fossilized internal introns.
Plant mitochondrial genomes can be complex owing to highly recombinant structures, lack of gene syntenies, heavy RNA editing and invasion of chloroplast, nuclear or even foreign DNA by horizontal gene transfer (HGT). Leptosporangiate ferns remained the last major plant clade without an assembled mitogenome, likely owing to a demanding combination of the above. Here we present both organelle genomes for Haplopteris ensiformis. More than 1,400 events of C-to-U RNA editing and over 500 events of reverse U-to-C editing affect its organelle transcriptomes. The Haplopteris mtDNA is gene-rich, lacking only the ccm gene suite present in ancestral land plant mitogenomes, but is highly unorthodox, indicating extraordinary recombinogenic activity. Although eleven group II introns known in disrupted trans-splicing states in seed plants exist in conventional cis-arrangements, a particularly complex structure is found for the mitochondrial rrnL gene, which is split into two parts that need reassembly at the RNA level by a trans-splicing group I intron. Aside from ca. 80 chloroplast DNA inserts that complicated the mitogenome assembly, the Haplopteris mtDNA features, as an idiosyncrasy, 30 variably degenerated protein-coding regions from Rickettsiales bacteria, indicative of heavy bacterial HGT, on top of tRNA genes of chlamydial origin.
Group II introns are common in the two endosymbiotic organelle genomes of the plant lineage. Chloroplasts harbor 22 positionally conserved group II introns whereas their occurrence in land plant (embryophyte) mitogenomes is highly variable and specific for the seven major clades: liverworts, mosses, hornworts, lycophytes, ferns, gymnosperms and flowering plants. Each plant group features “signature selections” of ca. 20–30 paralogues from a superset of altogether 105 group II introns identified to date in embryophyte mtDNAs, suggesting massive intron gains and losses along the backbone of plant phylogeny. We report on systematically categorizing plant mitochondrial group II introns into “families”, comprising evidently related paralogues at different insertion sites, which may even be more similar to each other than to their respective orthologues in phylogenetically distant taxa. Including streptophyte (charophyte) algae extends our sampling to 161 and we sort 104 streptophyte mitochondrial group II introns into 25 core families of related paralogues evidently arising from retrotransposition events. Adding to discoveries of only recently created intron paralogues, hypermobile introns and twintrons, our survey led to further discoveries including previously overlooked “fossil” introns in spacer regions or, for example, in the rps8 pseudogene of lycophytes. Having initially excluded intron-borne maturase sequences from family categorization, we added an independent analysis of maturase phylogenies and find a surprising incongruence between intron mobility and the presence of intron-borne maturases. Intriguingly, however, we find that several nuclear splicing factors characterized in the meantime simultaneously facilitate splicing of independent paralogues now placed into the same intron families. Altogether this suggests that plant group II intron mobility, in contrast to that of their bacterial counterparts, is not intimately linked to intron-encoded maturases.
Elucidating the relationship between the sequences of non-coding regulatory elements and their target genes is key to understanding gene regulation and its variation between plant species and ecotypes. In this study, we developed deep learning models that link gene sequence data with mRNA copy number for the plant species Arabidopsis thaliana, Sorghum bicolor, Solanum lycopersicum and Zea mays, and predicted the regulatory effect of gene sequence variation. Our models achieved over 80% accuracy in the species-specific and multi-species prediction tasks and enabled predictive feature selection within the input regulatory sequences. Saliency scores of the model highlighted a set of expression-predictive sequence features and the profound importance of the UTRs in determining the level of gene expression. The identified sequence features exhibited remarkable conservation across plant species and achieved more than 70% accuracy in cross-species expression prediction. We demonstrated the application of our model on 14 newly assembled tomato genomes, in which we annotated the effect of structural genetic variation on gene expression. Finally, we showed that by providing an accurate prediction of differences in the expression of biosynthetic enzymes and their individual homologs, the model highlights known metabolic differences between related genotypes. This was demonstrated for biosynthetic pathways of stress-related compounds in Solanum lycopersicum and its wild drought-resistant relative Solanum pennellii.