Background Several studies have mined short-read RNA sequencing datasets to identify long non-coding RNAs (lncRNAs), and others have focused on the function of individual lncRNAs in abiotic stress response. However, our understanding of the complement, function and origin of lncRNAs, and especially transposon-derived lncRNAs (TE-lncRNAs), in response to abiotic stress is still in its infancy. Results We utilized a dataset of 127 RNA sequencing samples that included total RNA datasets and PacBio fl-cDNA data to discover lncRNAs in maize. Overall, we identified 23,309 candidate lncRNAs from polyA+ and total RNA samples, with a strong discovery bias within total RNA. The majority (65%) of the 23,309 lncRNAs had sequence similarity to transposable elements (TEs). Most had similarity to long-terminal-repeat retrotransposons from the Copia and Gypsy superfamilies, reflecting a high proportion of these elements in the genome. However, DNA transposons were enriched for lncRNAs relative to their genomic representation by ~2-fold. By assessing the fraction of lncRNAs that respond to abiotic stresses like heat, cold, salt and drought, we identified 1077 differentially expressed lncRNA transcripts, including 509 TE-lncRNAs. In general, the expression of these lncRNAs was significantly correlated with their nearest gene. By inferring co-expression networks across our large dataset, we found that 39 lncRNAs act as major hubs in co-expression networks that respond to abiotic stress, and 18 appear to be derived from TEs. Conclusions Our results show that lncRNAs are enriched in total RNA samples, that most (65%) are derived from TEs, that at least 1077 are differentially expressed during abiotic stress, and that 39 are hubs in co-expression networks, including a small number that are evolutionarily conserved. These results suggest that lncRNAs, including TE-lncRNAs, may play key regulatory roles in moderating abiotic responses.
Gene body methylation (gbM) is an epigenetic mark where gene exons are methylated in the CG context only, as opposed to CHG and CHH contexts (where H stands for A, C or T). CG methylation is transmitted transgenerationally in plants, opening the possibility that gbM may be shaped by adaptation. This presupposes, however, that gbM has a function that affects phenotype, which has been a topic of debate in the literature. Here we review our current knowledge of gbM in plants. We start by presenting the well elucidated mechanisms of plant gbM establishment and maintenance. We then review more controversial topics: the evolution of gbM and the potential selective pressures that act on it. Finally, we discuss the potential functions of gbM that may affect organismal phenotypes: gene expression stabilization and upregulation, inhibition of aberrant transcription (reverse and internal), prevention of aberrant intron retention and protection against TE insertions. To bolster the review of these topics, we include novel analyses to assess the effect of gbM on transcripts. Overall, a growing body of literature finds that gbM correlates with levels and patterns of gene expression. It is not clear, however, if this is a causal relationship. Altogether, functional work suggests that the effects of gbM, if any, must be relatively small, but there is nonetheless evidence that it is shaped by natural selection. We conclude by discussing the potential adaptive character of gbM and its implications for an updated view of the mechanisms of adaptation in plants.
mCHH islands are peaks of CHH methylation that occur primarily upstream of genes. These regions are actively targeted by the methylation machinery, occur at boundaries between heterochromatin and euchromatin, and tend to be near highly expressed genes. Here we took an evolutionary perspective by studying upstream mCHH islands across a sample of eight grass species. Using a statistical approach to define mCHH islands as regions that differ from genome-wide background CHH methylation levels, we demonstrated that mCHH islands are common and associate with 39% of genes, on average. We hypothesized that islands should be more frequent in genomes of large size, because they have more heterochromatin and hence more need for defined boundaries. We found, however, that smaller genomes tended to have a higher proportion of genes associated with 5’ mCHH islands. Consistent with previous work suggesting that islands reflect the silencing of the edges of transposable elements (TEs), genes with nearby TEs were more likely to have mCHH islands. However, the presence of mCHH islands was not a function solely of TEs, both because the underlying sequences of islands were often not homologous to TEs and because genic properties also predicted the presence of 5’ mCHH islands. These genic properties included length and gene-body methylation (gbM); in fact, in three of eight species the absence of gbM was a stronger predictor of a 5’ mCHH island than TE proximity. In contrast, gene expression level was a positive but weak predictor of the presence of an island. Finally, we assessed whether mCHH islands were evolutionarily conserved by focusing on a set of 2,720 orthologs across the eight species. They were generally not conserved across evolutionary time. Overall, our data establish additional genic properties that are associated with mCHH islands and suggest that they are not just a consequence of the TE silencing machinery.
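The abstract above says only that islands were defined statistically as regions that differ from genome-wide background CHH methylation; it does not name the test. As a minimal sketch of that idea, assuming a one-sided exact binomial test per upstream window, the example below flags windows whose methylated CHH read count exceeds what the pooled background rate would predict. The function names, the window representation, and the `alpha` cutoff are all hypothetical, and no multiple-testing correction is applied.

```python
from math import comb


def binom_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def call_islands(windows, alpha=0.01):
    """Flag windows with more CHH methylation than the pooled background.

    windows: list of (methylated_CHH_reads, total_CHH_reads) tuples, one per
    upstream window (a hypothetical input format). Returns a list of booleans.
    """
    total_methylated = sum(m for m, _ in windows)
    total_reads = sum(n for _, n in windows)
    # Background rate pooled over all windows (assumption: the windows
    # supplied are representative of the genome-wide background).
    background = total_methylated / total_reads
    return [binom_upper_tail(m, n, background) < alpha for m, n in windows]


# Invented counts: only the third window is far above the pooled rate of 0.09.
example = [(2, 100), (3, 100), (30, 100), (1, 100)]
print(call_islands(example))  # [False, False, True, False]
```

The exact tail sum is adequate for modest read depths; for deep coverage or genome-wide scans a library implementation of the binomial test would be the more practical choice.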
Apomixis, or asexual seed formation, is prevalent in the Citrinae via a mechanism termed nucellar or adventitious embryony. Here, multiple embryos of a maternal genotype form directly from nucellar cells in the ovule and can outcompete the developing zygotic embryo as they utilize the sexually derived endosperm for growth. Whilst nucellar embryony enables the propagation of clonal plants of maternal genetic constitution, it is also a barrier to effective breeding through hybridization. To address the genetics and evolution of apomixis in the Citrinae, a chromosome-level genome of Hongkong kumquat (Fortunella hindsii) was assembled following a genome-wide variation map including structural variants (SVs) based on 234 Citrinae accessions. This map revealed that hybrid citrus cultivars shelter genome-wide deleterious mutations and SVs into heterozygous states free from recessive selection, which may explain the capability of nucellar embryony in most cultivars during Citrinae diversification. Analyses revealed that parallel evolution may explain the repeated origin of apomixis in different genera of Citrinae. Within Fortunella, we found that apomixis of some varieties originated via introgression. In apomictic Fortunella, the locus associated with apomixis contains the FhRWP gene, encoding an RWP-RK domain-containing protein previously shown to be required for nucellar embryogenesis in Citrus. We found a heterozygous SV in the FhRWP and CitRWP promoters of apomictic Citrus and Fortunella, caused by either two or three miniature inverted-repeat transposable element (MITE) insertions. A transcription factor, FhARID, an AT-rich interaction domain-containing protein, binds to the MITEs in the promoter of apomictic varieties, which facilitates induction of nucellar embryogenesis. This study provides evolutionary genomic and molecular insights into apomixis in Citrinae and has potential ramifications for citrus breeding.
A subset of genes in plant genomes are labeled with DNA methylation specifically at CG residues. These genes, known as gene-body methylated (gbM), have a number of associated characteristics. They tend to have longer sequences, to be enriched for intermediate expression levels, and to be associated with slower rates of molecular evolution. Most importantly, gbM
The organization of chromatin into self-interacting domains is universal among eukaryotic genomes, though how and why they form varies considerably. Here we report a chromosome-scale reference genome assembly of pepper (Capsicum annuum) and explore its 3D organization through integrating high-resolution Hi-C maps with epigenomic, transcriptomic, and genetic variation data. Chromatin folding domains in pepper are as prominent as TADs in mammals but exhibit unique characteristics. They tend to coincide with heterochromatic regions enriched with retrotransposons and are frequently embedded in loops, which may correlate with transcription factories. Their boundaries are hotspots for chromosome rearrangements but are otherwise depleted for genetic variation. While chromatin conformation broadly affects transcription variance, it does not predict differential gene expression between tissues. Our results suggest that pepper genome organization is explained by a model of heterochromatin-driven folding promoted by transcription factories and that such spatial architecture is under structural and functional constraints.
Background Despite marked recent improvements in long-read sequencing technology, the assembly of diploid genomes remains a difficult task. A major obstacle is distinguishing between alternative contigs that represent highly heterozygous regions. If primary and secondary contigs are not properly identified, the primary assembly will overrepresent both the size and complexity of the genome, which complicates downstream analysis such as scaffolding. Results Here we illustrate a new method, which we call HapSolo, that identifies secondary contigs and defines a primary assembly based on multiple pairwise contig alignment metrics. HapSolo evaluates candidate primary assemblies using BUSCO scores and then distinguishes among candidate assemblies using a cost function. The cost function can be defined by the user but by default considers the number of missing, duplicated and single BUSCO genes within the assembly. HapSolo performs hill climbing to minimize cost over thousands of candidate assemblies. We illustrate the performance of HapSolo on genome data from three species: the Chardonnay grape (Vitis vinifera), with a genome of 490 Mb, a mosquito (Anopheles funestus; 200 Mb) and the Thorny Skate (Amblyraja radiata; 2650 Mb). Conclusions HapSolo rapidly identified candidate assemblies that yield improvements in assembly metrics, including decreased genome size and improved N50 scores. Contig N50 scores improved by 35%, 9% and 9% for Chardonnay, mosquito and the thorny skate, respectively, relative to unreduced primary assemblies. The benefits of HapSolo were amplified by down-stream analyses, which we illustrated by scaffolding with Hi-C data. We found, for example, that prior to the application of HapSolo, only 52% of the Chardonnay genome was captured in the largest 19 scaffolds, corresponding to the number of chromosomes. After the application of HapSolo, this value increased to ~ 84%. 
The improvements for the mosquito’s largest three scaffolds, representing the number of chromosomes, were from 61 to 86%, and the improvement was even more pronounced for thorny skate. We compared the scaffolding results to assemblies that were based on PurgeDups for identifying secondary contigs, with generally superior results for HapSolo.
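The HapSolo abstract above describes hill climbing over candidate assemblies scored by a BUSCO-based cost. The following is a toy sketch of that general idea, not HapSolo's actual implementation: the contig records, the two threshold parameters (`min_identity`, `min_coverage`), the cost formula, and every function name are assumptions made for illustration, and the BUSCO run is replaced by a simple count over invented gene IDs.

```python
import random

# Invented contigs: alignment metrics against the best-matching larger contig,
# plus the BUSCO gene IDs carried by each contig.
CONTIGS = [
    {"name": "c1", "identity": 0.00, "coverage": 0.00, "buscos": {"g1", "g2", "g3"}},
    {"name": "c2", "identity": 0.97, "coverage": 0.92, "buscos": {"g1", "g2"}},
    {"name": "c3", "identity": 0.60, "coverage": 0.30, "buscos": {"g4"}},
    {"name": "c4", "identity": 0.95, "coverage": 0.88, "buscos": {"g3"}},
    {"name": "c5", "identity": 0.40, "coverage": 0.10, "buscos": {"g5"}},
]
ALL_BUSCOS = {"g1", "g2", "g3", "g4", "g5"}


def primary_contigs(contigs, min_identity, min_coverage):
    """Keep contigs that are NOT flagged as secondary by both thresholds."""
    return [
        c for c in contigs
        if not (c["identity"] >= min_identity and c["coverage"] >= min_coverage)
    ]


def busco_counts(contigs):
    """Return (missing, duplicated, single) BUSCO counts for a contig set."""
    copies = {gene: 0 for gene in ALL_BUSCOS}
    for contig in contigs:
        for gene in contig["buscos"]:
            copies[gene] += 1
    missing = sum(1 for n in copies.values() if n == 0)
    single = sum(1 for n in copies.values() if n == 1)
    duplicated = sum(1 for n in copies.values() if n > 1)
    return missing, duplicated, single


def cost(contigs, thresholds):
    """Hypothetical cost: (missing + duplicated) / single, lower is better."""
    missing, duplicated, single = busco_counts(primary_contigs(contigs, *thresholds))
    return (missing + duplicated) / max(single, 1)


def hill_climb(contigs, start, steps=2000, step_size=0.05, seed=0):
    """Perturb the thresholds at random; keep any move that does not raise cost."""
    rng = random.Random(seed)
    best, best_cost = start, cost(contigs, start)
    for _ in range(steps):
        candidate = tuple(
            min(1.0, max(0.0, t + rng.uniform(-step_size, step_size))) for t in best
        )
        candidate_cost = cost(contigs, candidate)
        if candidate_cost <= best_cost:  # accept ties so the search can cross plateaus
            best, best_cost = candidate, candidate_cost
    return best, best_cost


thresholds, final_cost = hill_climb(CONTIGS, start=(0.99, 0.99))
print(thresholds, final_cost)
```

In this toy data the unreduced assembly (thresholds of 0.99) has three duplicated BUSCOs and a cost of 1.5, while thresholds such as (0.9, 0.8) drop the two redundant contigs and reach a cost of 0. A single climb can stall on a plateau, so in practice several random starting points would typically be tried.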