Methods of artificial evolution such as SELEX and in vitro selection have made it possible to isolate RNA and DNA motifs with a wide range of functions from large random sequence libraries. Once the primary sequence of a functional motif is known, the sequence space around it can be comprehensively explored using a combination of random mutagenesis and selection. However, methods to explore the sequence space of a secondary structure are not as well characterized. Here we address this gap by describing a method to construct, in a single synthesis, libraries that are enriched for sequences with the potential to form a specific secondary structure, such as that of an aptamer, ribozyme, or deoxyribozyme. Although interactions such as base pairs cannot be encoded in a library using conventional DNA synthesizers, it is possible to modulate the probability that two positions will have the potential to pair by biasing the nucleotide composition at these positions. Here we show how to maximize this probability for each of the possible ways to encode a pair (in this study defined as A-U or U-A or C-G or G-C or G.U or U.G). We then use these optimized coding schemes to calculate the number of different variants of model stems and secondary structures expected to occur in a library, for a series of structures in which the number of pairs and the extent of conservation of unpaired positions are systematically varied. Our calculations reveal a tradeoff between maximizing the probability of forming a pair and maximizing the number of possible variants of a desired secondary structure that can occur in the library. They also indicate that the optimal coding strategy for a library depends on the complexity of the motif being characterized. Because this approach provides a simple way to generate libraries enriched for sequences with the potential to form a specific secondary structure, we anticipate that it should be useful for the optimization and structural characterization of functional nucleic acid motifs.
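The idea of biasing nucleotide composition to raise the chance of a pair can be illustrated with a small calculation. The following is a minimal sketch, not the authors' optimized coding schemes: it assumes the two positions are synthesized independently, and the function names (`pair_probability`, `encoded_pairs`) and the example mixtures are hypothetical choices made for illustration. Only the definition of a pair (A-U, U-A, C-G, G-C, G.U, U.G) is taken from the abstract.

```python
# Pairs as defined in the abstract: Watson-Crick pairs plus G.U wobble,
# listed in both orientations (nucleotide at position 1, nucleotide at position 2).
PAIRS = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C"), ("G", "U"), ("U", "G")}


def pair_probability(pos1, pos2):
    """Probability that two independently synthesized positions can pair.

    pos1 and pos2 map nucleotides to their fractions in the synthesis
    mixture at each position (hypothetical inputs; fractions sum to 1).
    """
    return sum(pos1.get(a, 0.0) * pos2.get(b, 0.0) for a, b in PAIRS)


def encoded_pairs(pos1, pos2):
    """Distinct pairs that can occur at all with the given mixtures."""
    return sorted(
        (a, b) for a, b in PAIRS
        if pos1.get(a, 0.0) > 0 and pos2.get(b, 0.0) > 0
    )


# Illustrative mixtures (assumptions, not values from the paper).
uniform = {n: 0.25 for n in "ACGU"}   # fully random position
fixed_g = {"G": 1.0}                  # position fixed as G
c_or_u = {"C": 0.5, "U": 0.5}         # equal mix of C and U

print(pair_probability(uniform, uniform), len(encoded_pairs(uniform, uniform)))
print(pair_probability(fixed_g, c_or_u), encoded_pairs(fixed_g, c_or_u))
```

Under these assumptions, two fully random positions can form all six pairs but do so with probability 0.375, whereas fixing one position as G and mixing C and U at the other gives a pairing probability of 1.0 but allows only two of the six pairs. This is a toy instance of the tradeoff between pairing probability and variant diversity that the abstract describes.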
G-quadruplexes are noncanonical nucleic acid structures formed by stacked guanine tetrads. They are capable of a range of functions and are thought to play widespread biological roles. This diversity raises an important question: what determines the biochemical specificity of G-quadruplex structures? The answer is particularly important from the perspective of biological regulation because genomes can contain hundreds of thousands of G-quadruplexes with a range of functions. Here we analyze the specificity of each sequence in a 496-member library of variants of a reference G-quadruplex with respect to five functions. Our analysis shows that the sequence requirements of G-quadruplexes with these functions differ from one another, with some mutations altering biochemical specificity by orders of magnitude. Mutations in tetrads have larger effects than mutations in loops, and changes in specificity are correlated with changes in multimeric state. To complement our biochemical data, we determined the solution structure of a monomeric G-quadruplex from the library. The stacked and accessible tetrads rationalize why monomers tend to promote a model peroxidase reaction and generate fluorescence. Our experiments support a model in which the sequence requirements of G-quadruplexes with different functions are overlapping but distinct. This has implications for biological regulation, bioinformatics, and drug design.