New regulatory roles continue to emerge for both natural and engineered noncoding RNAs, many of which have specific secondary and tertiary structures essential to their function. Thus there is a growing need to develop technologies that enable rapid characterization of structural features within complex RNA populations. We have developed a high-throughput technique, SHAPE-Seq, that can simultaneously measure quantitative, single nucleotide-resolution secondary and tertiary structural information for hundreds of RNA molecules of arbitrary sequence. SHAPE-Seq combines selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) chemistry with multiplexed paired-end deep sequencing of primer extension products. This generates millions of sequencing reads, which are then analyzed using a fully automated data analysis pipeline, based on a rigorous maximum likelihood model of the SHAPE-Seq experiment. We demonstrate the ability of SHAPE-Seq to accurately infer secondary and tertiary structural information, detect subtle conformational changes due to single nucleotide point mutations, and simultaneously measure the structures of a complex pool of different RNA molecules. SHAPE-Seq thus represents a powerful step toward making the study of RNA secondary and tertiary structures high throughput and accessible to a wide array of scientific pursuits, from fundamental biological investigations to engineering RNA for synthetic biological systems.
chemical probing | RNA sequencing | RNA folding | genomics
Over the past several years, there has been an explosion in the discovery of noncoding but functional RNAs that play central roles in maintaining, regulating, and defending the genome (1). At the same time, RNA-based mechanisms have emerged as powerful tools for engineering synthetic biological systems (2).
Many of these natural and synthetic RNAs have specific secondary and tertiary structures essential to their function, and there is a growing need to develop technologies that enable rapid characterization of structural features within complex RNA populations. Such a high-throughput structure characterization assay would allow rapid assessment of the impact of sequence on structure and function and enable RNA engineers to design libraries of RNA molecules with desired structural properties.
Two techniques for high-throughput RNA structure characterization have recently been reported: parallel analysis of RNA structures (PARS) (3) and fragmentation sequencing (FragSeq) (4). Both techniques couple classic in vitro nuclease probing techniques, which are traditionally performed one RNA at a time, with deep sequencing of RNA fragments to simultaneously probe a complex mixture of RNAs sampled from transcriptomes. Although important first steps, these techniques provide only low-resolution secondary structure information due to the limitations inherent in nuclease probing (5).
We have developed a high-throughput technique, SHAPE-Seq, that can simultaneously measure quantitative, single nucleotide-resolution secondary and tertia...
Sequence census methods reduce molecular measurements such as transcript abundance and protein-nucleic acid interactions to counting problems via DNA sequencing. We focus on a novel assay utilizing this approach, called selective 2′-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq), that can be used to characterize RNA secondary and tertiary structure. We describe a fully automated data analysis pipeline for SHAPE-Seq analysis that includes read processing, mapping, and structural inference based on a model of the experiment. Our methods rely on the solution of a series of convex optimization problems for which we develop efficient and effective numerical algorithms. Our results can be easily extended to other chemical probes of RNA structure, and also generalized to modeling polymerase drop-off in other sequence census-based experiments.
Over the past 30 years, techniques have been developed that probe RNA structures with small molecules. In this class of techniques, a chemical reagent modifies RNA molecules in a structure-dependent fashion. Depending on the reagent used, four distinct types of information can be gleaned, including spatial nucleotide contact information, solvent accessibility of the RNA backbone, the local electrostatic environment adjacent to each nucleotide, and the local nucleotide flexibility (1). In each of these techniques, the modification location is detected during conversion to cDNA by blockage of reverse transcriptase at the modification site (Fig. 1). The detection can be performed by direct sequencing of the cDNA fragments using high-throughput sequencing technology (2).
However, because at most a single modified site is revealed by every sequenced fragment (the closest modification to the 3′ end), a mathematical model and inference framework are needed to accurately infer the underlying structural properties given the observed fragment distribution.
In this work, we introduce such a model and framework in the context of the SHAPE (selective 2′-hydroxyl acylation analyzed by primer extension) technique for characterizing local nucleotide flexibility (3-5). The identification of adduct formation can be performed by capillary electrophoresis (SHAPE-CE) or by high-throughput sequencing of cDNA fragments (SHAPE-Seq) (2) (Fig. 1). Every fragment begins at the 3′ end of the molecule and terminates at some adduct [(+) channel], or possibly at a location where there was natural polymerase drop-off (6, 7), which is controlled for in a separate control experiment [(−) channel]. Following sequencing, reads are mapped back to the RNA sequence and are classified by their end location. The resulting read counts are the sufficient statistics for a model that is used to infer estimates of the probabilities of adduct formation at each nucleotide, called relative reactivities.
The probabilistic model we develop for SHAPE and the sequencing that follows in SHAPE-Seq is highly structured and has recursive properties that allow for efficient maximum-likelihood inference and confidenc...
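As a rough illustration of how end-location read counts from the (+) and (−) channels could be turned into per-nucleotide adduct probabilities, the following is a minimal sketch under simplifying assumptions. It is not the authors' model or pipeline. It assumes that reverse transcriptase stops independently at each site, that sites are indexed in the order the polymerase meets them (from the 3′ end), and that natural drop-off rates are the same in both channels. The function names `stop_probabilities` and `reactivities` and the toy counts are hypothetical.

```python
from typing import List, Sequence


def stop_probabilities(stops: Sequence[int], full_length: int) -> List[float]:
    """Estimate, for each site, the probability that extension stops there
    given that the polymerase reached it.

    stops[k] is the number of reads ending at site k, with sites ordered
    as the polymerase meets them (3' to 5'). full_length is the number of
    reads that extended past every site. (Assumed input layout.)
    """
    reached = sum(stops) + full_length  # fragments that reached site 0
    probs = []
    for count in stops:
        probs.append(count / reached if reached > 0 else 0.0)
        reached -= count  # fragments that stopped here never reach later sites
    return probs


def reactivities(
    plus_stops: Sequence[int],
    plus_full: int,
    minus_stops: Sequence[int],
    minus_full: int,
) -> List[float]:
    """Estimate per-site adduct probabilities by correcting the (+) channel
    stop probabilities for natural drop-off measured in the (-) channel.

    Assumes a (+) channel stop happens if there is an adduct OR natural
    drop-off, so 1 - p_plus = (1 - beta) * (1 - p_minus).
    """
    p_plus = stop_probabilities(plus_stops, plus_full)
    p_minus = stop_probabilities(minus_stops, minus_full)
    betas = []
    for pp, pm in zip(p_plus, p_minus):
        if pm >= 1.0:
            betas.append(0.0)  # no information: control always stops here
        else:
            beta = (pp - pm) / (1.0 - pm)
            betas.append(min(1.0, max(0.0, beta)))  # clip to a valid probability
    return betas


# Toy counts (made up for illustration only).
example = reactivities(plus_stops=[50, 25], plus_full=25,
                       minus_stops=[0, 10], minus_full=40)
```

With the toy counts, the first site has no drop-off in the control, so its estimate equals the raw (+) channel stop probability; the second site is adjusted downward because part of its signal is explained by natural drop-off. The conditioning on "fragments that reached this site" is what accounts for each fragment revealing only the modification closest to the 3′ end.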
Knowledge of RNA structure is critical to understanding both the important functional roles of RNA in biology and the engineering of RNA to control biological systems. This article contains a protocol for selective 2′‐hydroxyl acylation analyzed by primer extension and sequencing (SHAPE‐Seq) that, through a combination of structure‐dependent chemical probing and next‐generation sequencing technologies, achieves structural characterization of hundreds of RNAs in a single experiment. This protocol is applicable in a variety of conditions, and represents an important tool for understanding RNA biology. The protocol includes methods for the design and synthesis of RNA mixtures for study, and the construction and analysis of structure‐dependent sequencing libraries that reveal structural information of the RNAs in the mixtures. The methods are generally applicable to studying RNA structure and interactions in vitro in a variety of conditions, and allow for the rapid characterization of RNA structures in a high‐throughput manner. Curr. Protoc. Chem. Biol. 4:275‐297 © 2012 by John Wiley & Sons, Inc.
Abstract: Despite great interest in solving RNA secondary structures due to their impact on function, it remains an open problem to determine structure from sequence. Among experimental approaches, a promising candidate is the "chemical modification strategy", which involves application of chemicals to RNA that are sensitive to structure and that result in modifications that can be assayed via sequencing technologies. One approach that can reveal paired nucleotides via chemical modification followed by sequencing is SHAPE, and it has been used in conjunction with capillary electrophoresis (SHAPE-CE) and high-throughput sequencing (SHAPE-Seq). The solution of mathematical inverse problems is needed to relate the sequence data to the modified sites, and a number of approaches have been previously suggested for SHAPE-CE, and separately for SHAPE-Seq analysis.
Here we introduce a new model for inference of chemical modification experiments, whose formulation results in closed-form maximum likelihood estimates that can be easily applied to data. The model can be specialized to both SHAPE-CE and SHAPE-Seq, and therefore allows for a direct comparison of the two technologies. We then show that the extra information obtained with SHAPE-Seq but not with SHAPE-CE is valuable with respect to ML estimation.