RNA structure has critical roles in processes ranging from ligand sensing to the regulation of translation, polyadenylation and splicing. However, a lack of genome-wide in vivo RNA structural data has limited our understanding of how RNA structure regulates gene expression in living cells. Here we present a high-throughput, genome-wide in vivo RNA structure probing method, structure-seq, in which dimethyl sulphate methylation of unprotected adenines and cytosines is identified by next-generation sequencing. Application of this method to Arabidopsis thaliana seedlings yielded the first in vivo genome-wide RNA structure map at nucleotide resolution for any organism, with quantitative structural information across more than 10,000 transcripts. Our analysis reveals a three-nucleotide periodic repeat pattern in the structure of coding regions, as well as a less-structured region immediately upstream of the start codon, and shows that these features are strongly correlated with translation efficiency. We also find patterns of strong and weak secondary structure at sites of alternative polyadenylation, as well as strong secondary structure at 5' splice sites that correlates with unspliced events. Notably, in vivo structures of messenger RNAs annotated for stress responses are poorly predicted in silico, whereas mRNA structures of genes related to cell function maintenance are well predicted. Global comparison of several structural features between these two categories shows that the mRNAs associated with stress responses tend to have more single-strandedness, longer maximal loop length and higher free energy per nucleotide, features that may allow these RNAs to undergo conformational changes in response to environmental conditions. Structure-seq allows the RNA structurome and its biological roles to be interrogated on a genome-wide scale and should be applicable to any organism.
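One simple way to look for the three-nucleotide periodicity described above is to average a per-nucleotide reactivity signal by codon position. The following is a minimal sketch, not the authors' analysis: the function name `mean_by_codon_position`, the input layout (a list starting at the first nucleotide of the start codon, with `None` for positions lacking data), and the toy values are all assumptions made for illustration.

```python
import math

def mean_by_codon_position(reactivities):
    """Average reactivity at codon positions 1, 2 and 3.

    `reactivities` is a hypothetical per-nucleotide signal for one coding
    region, indexed from the first nucleotide of the start codon.
    Entries that are None (no data) are skipped.
    """
    sums = [0.0, 0.0, 0.0]
    counts = [0, 0, 0]
    for index, value in enumerate(reactivities):
        if value is None:
            continue
        frame = index % 3  # 0, 1, 2 correspond to codon positions 1, 2, 3
        sums[frame] += value
        counts[frame] += 1
    # NaN marks a codon position with no usable data at all
    return [s / c if c else math.nan for s, c in zip(sums, counts)]

# Toy values only, not data from the study
toy_signal = [1.0, None, 0.5, 3.0, 0.0, 0.5]
codon_means = mean_by_codon_position(toy_signal)
```

A consistent difference between the three averages across many transcripts would be compatible with a three-nucleotide repeat; a real analysis would also need normalization and a statistical test, which this sketch omits.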
We introduce RNA G-quadruplex sequencing (rG4-seq), a transcriptome-wide RNA G-quadruplex (rG4) profiling method that couples rG4-mediated reverse transcriptase stalling with next-generation sequencing. Using rG4-seq on polyadenylated-enriched HeLa RNA, we generated a global in vitro map of thousands of canonical and noncanonical rG4 structures. We characterize rG4 formation relative to cytosine content and alternative RNA structure stability, uncover rG4-dependent differences in RNA folding and show evolutionarily conserved enrichment in transcripts mediating RNA processing and stability.
Guanine (G)-rich sequences in nucleic acids can assemble into G-quadruplex structures that involve G-quartets linked by loop nucleotides. The structural and topological diversity of G-quadruplexes has attracted great attention for decades. Recent methodological advances have improved the identification and characterization of G-quadruplexes in vivo as well as in vitro, at much higher resolution and throughput, which has greatly expanded our current understanding of G-quadruplex structure and function. Accumulating knowledge about the structural properties of G-quadruplexes has helped to design and develop a repertoire of molecular and chemical tools for biological applications. This review highlights how these exciting methods and findings have opened new doors to investigate the potential functions and applications of G-quadruplexes in basic and applied biosciences.
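The arrangement of G-tracts linked by loop nucleotides can be illustrated with a sequence scan. The following is a minimal sketch that assumes the commonly used motif of four G-tracts separated by loops of 1 to 7 nucleotides; it is not the detection method of any study listed here, and the function name `find_putative_g4` and its parameters `min_tract` and `max_loop` are hypothetical. A sequence match indicates folding potential only, not actual folding.

```python
import re

def find_putative_g4(seq, min_tract=3, max_loop=7):
    """Return (start, motif) pairs for putative G-quadruplex sequences.

    Assumed motif: four G-tracts of at least `min_tract` guanines,
    separated by three loops of 1..`max_loop` nucleotides.
    Matches are non-overlapping and reported with 0-based starts.
    """
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA input
    tract = "G{%d,}" % min_tract
    loop = "[ACGU]{1,%d}" % max_loop
    pattern = re.compile(tract + (loop + tract) * 3)
    return [(m.start(), m.group(0)) for m in pattern.finditer(rna)]

# Three-quartet candidate (default settings)
three_quartet_hits = find_putative_g4("AAGGGAGGGAGGGAGGGAA")

# Two-quartet candidates need min_tract=2
two_quartet_hits = find_putative_g4("CCGGAGGAGGAGGCC", min_tract=2)
```

Setting `min_tract=2` widens the search to sequences that could form only two G-quartets, at the cost of many more candidate hits.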
The structural flexibility of RNA underlies fundamental biological processes, but there are no methods for exploring the multiple conformations adopted by RNAs in vivo. We developed cross-linking of matched RNAs and deep sequencing (COMRADES) for in-depth RNA conformation capture, and a pipeline for the retrieval of RNA structural ensembles. Using COMRADES, we determined the architecture of the Zika virus RNA genome inside cells, and identified multiple site-specific interactions with human noncoding RNAs.
RNA structure plays important roles in diverse biological processes. However, the structures of all but the few most abundant RNAs are presently unknown in vivo. Here we introduce DMS/SHAPE-LMPCR to query the in vivo structures of low-abundance transcripts. DMS/SHAPE-LMPCR achieves attomole sensitivity, a 100,000-fold improvement over conventional methods. We probe the structure of low-abundance U12 small nuclear RNA (snRNA) in Arabidopsis thaliana and provide in vivo evidence supporting our derived phylogenetic structure. Interestingly, in contrast to mammalian U12 snRNAs, the loop of the SLIIb in U12 snRNA is variable among plant species, and DMS/SHAPE-LMPCR determines it to be unstructured. We reveal the effects of proteins on 25S rRNA, 5.8S rRNA and U12 snRNA structure, illustrating the critical importance of mapping RNA structure in vivo. Our universally applicable method opens the door to identifying and exploring the specific structure-function relationships of the multitude of low-abundance RNAs that prevail in living cells.
Background: Guanine-rich sequences are able to form complex RNA structures termed RNA G-quadruplexes in vitro. Because of their high stability, RNA G-quadruplexes are proposed to exist in vivo and are suggested to be associated with important biological relevance. However, there is a lack of direct evidence for RNA G-quadruplex formation in living eukaryotic cells. Therefore, it is unclear whether any purported functions are associated with the specific sequence content or the formation of an RNA G-quadruplex structure. Results: Using rG4-seq, we profile the landscape of those guanine-rich regions with the in vitro folding potential in the Arabidopsis transcriptome. We find a global enrichment of RNA G-quadruplexes with two G-quartets whereby the folding potential is strongly influenced by RNA secondary structures. Using in vitro and in vivo RNA chemical structure profiling, we determine that hundreds of RNA G-quadruplex structures are strongly folded in both Arabidopsis and rice, providing direct evidence of RNA G-quadruplex formation in living eukaryotic cells. Subsequent genetic and biochemical analyses show that RNA G-quadruplex folding is able to regulate translation and modulate plant growth. Conclusions: Our study reveals the existence of RNA G-quadruplex in vivo and indicates that RNA G-quadruplex structures act as important regulators of plant development and growth.