Aptamers are short single-stranded RNA or DNA molecules that bind to specific target molecules. Aptamers with high binding affinity and target specificity are identified using an in vitro procedure called high-throughput systematic evolution of ligands by exponential enrichment (HT-SELEX). However, developing aptamer affinity reagents is slow and costly because HT-SELEX produces a large dataset of candidate sequences, some of which have insufficient binding affinity. Here, we present RNA aptamer Ranker (RaptRanker), a novel in silico method for identifying high binding-affinity aptamers from HT-SELEX data by scoring and ranking. RaptRanker analyzes HT-SELEX data by evaluating nucleotide sequence and secondary structure simultaneously, and ranks sequences according to scores reflecting local structure and sequence frequencies. To evaluate the performance of RaptRanker, we performed two new HT-SELEX experiments and measured the binding affinities of a subset of the sequences, including aptamers with low binding affinity. In both datasets, RaptRanker outperformed Frequency, Enrichment, and MPBind. We also confirmed that considering secondary structure is effective in HT-SELEX data analysis, and that RaptRanker successfully predicted the essential subsequence motifs in each identified sequence.
Nucleic acid aptamers are generated by an in vitro molecular evolution method known as systematic evolution of ligands by exponential enrichment (SELEX). The variety of candidates is limited to the sequences actually observed in an experiment's sequencing data. Here we developed RaptGen, a variational autoencoder for in silico aptamer generation. RaptGen uses a profile hidden Markov model decoder to represent motif sequences effectively. We showed that RaptGen embedded simulated sequence data into a low-dimensional latent space on the basis of motif information. We also performed sequence embedding using two independent SELEX datasets. RaptGen successfully generated aptamers from the latent space even though they were not included in the high-throughput sequencing data. RaptGen could also generate a truncated aptamer with a short learning model. We demonstrated that RaptGen could be applied to activity-guided aptamer generation using Bayesian optimization. We concluded that the generative method of RaptGen and its latent representation are useful for aptamer discovery.
RNA aptamers are RNA molecules that bind to a target molecule with high affinity and specificity using uniquely folded tertiary structures. RNA aptamers are selected from an RNA pool, typically comprising up to 10^15 different sequences, by iterative steps of selection and amplification known as Systematic Evolution of Ligands by EXponential enrichment (SELEX). Over several rounds of SELEX, the diversity of the RNA pool decreases and the aptamers are enriched. Hence, monitoring the enrichment of these RNA pools is critical for the successful selection of aptamers, and several monitoring methods have been developed. In this study, we measured one-dimensional imino proton NMR spectra of RNA pools during SELEX. The spectrum of the initial RNA pool indicates that the RNAs adopt tertiary structures. The structural diversity of the RNA pools was shown to depend strongly on the design of the primer-binding sequence. Furthermore, we demonstrate that enrichment of RNA aptamers can be monitored using NMR. The RNA pools can be recovered from the NMR tube after the spectra are measured. We can also monitor target binding in the NMR tubes. Thus, we propose using NMR to monitor the enrichment of structured aptamers during the SELEX process.
The plasmid ColE2-P9 Rep protein specifically binds to the cognate replication origin to initiate DNA replication. The replicons of the plasmids ColE2-P9 and ColE3-CA38 are closely related, although the actions of the Rep proteins on the origins are specific to the plasmids. The previous chimera analysis identified two regions, regions A and B, in the Rep proteins and two sites, α and β, in the origins as specificity determinants and showed that when each component of the region A-site α pair and the region B-site β pair is derived from the same plasmid, plasmid DNA replication is efficient. It also indicated that the replication specificity is mainly determined by region A and site α. By using an electrophoretic mobility shift assay, we demonstrated that region B and site β play a critical role in stable Rep protein-origin binding and, furthermore, that 284-Thr in this region of the ColE2 Rep protein and the corresponding 293-Trp of the ColE3 Rep protein mainly determine the Rep-origin binding specificity. On the other hand, region A and site α were involved in the efficient unwinding of several nucleotide residues around site α, although they were not involved in the stable binding of the Rep protein to the origin. Finally, we discussed how the action of the Rep protein on the origin involving these specificity determinants leads to the plasmid-specific replication initiation.
In all organisms, DNA replication is a key event for inheritance of genetic information. Initiation of DNA replication requires interaction of the initiator protein with the specific DNA region called the replication origin and the consequent localized melting of duplex DNA, which provides a single-stranded template for establishment of replication machinery.
In initiation of chromosomal DNA replication in Escherichia coli, several molecules of the initiator protein (DnaA) tightly bind to the 9-mer repeated sequences called the DnaA boxes in the chromosomal replication origin (oriC), and then the DnaB helicase is loaded onto the unwound 13-mer AT-rich region located to one side of oriC, followed by establishment of the replication machinery containing DnaG and DNA polymerase III holoenzyme (17).
The plasmid ColE2-P9 (ColE2) is a circular duplex DNA molecule of about 7 kb (10) and is kept at 10 to 15 copies per chromosome (2, 12). Initiation of the plasmid replication requires host DNA polymerase I (16, 26) and a plasmid-encoded replication initiator (Rep) protein that uniquely possesses an origin-specific primase activity among bacterial plasmids (15, 35), and replication proceeds in a unidirectional manner (13, 28). The Rep protein specifically binds to the replication origin (15, 33) and synthesizes a short RNA molecule of 5′-ppApGpA-3′ at a specific position in the origin as a primer for initiation of DNA synthesis by DNA polymerase I (28, 29). The 32-bp minimal ColE2 origin may be divided into three functional subregions, as proposed by in vivo analyses using an exhaustive mutant set of single-base-pair substitutions (33). One of th...
Nucleic acid aptamers are generated by an in vitro molecular evolution method known as systematic evolution of ligands by exponential enrichment (SELEX). The variety of candidates is limited to the sequences actually observed in an experiment's sequencing data. Here, we developed RaptGen, a variational autoencoder for in silico aptamer generation. RaptGen uses a profile hidden Markov model decoder to represent motif sequences effectively. We showed that RaptGen embedded simulated sequence data into a low-dimensional latent space according to motif information. We also performed sequence embedding using two independent SELEX datasets. RaptGen successfully generated aptamers from the latent space even though they were not included in the high-throughput sequencing data. RaptGen could also generate a truncated aptamer with a short learning model. We demonstrated that RaptGen could be applied to activity-guided aptamer generation using Bayesian optimization. We concluded that the generative method of RaptGen and its latent representation are useful for aptamer discovery. Code is available at https://github.com/hmdlab/raptgen.