Using nanopores to sequence biopolymers was proposed more than a decade ago [1]. Recent advances in enzyme-based control of DNA translocation [2] and in DNA nucleotide resolution using modified biological pores [3] have satisfied two technical requirements of a functional nanopore DNA sequencing device. Nanopore sequencing of proteins was also envisioned [1]. Although proteins have been shown to move through nanopores [4, 5, 6], a technique to unfold proteins for processive translocation has yet to be demonstrated. Here we describe controlled unfolding and translocation of proteins through the α-hemolysin (α-HL) pore using the AAA+ unfoldase ClpX. Sequence-dependent features of individual engineered proteins were detected during translocation. These results demonstrate that molecular motors can reproducibly drive proteins through a model nanopore, a feature required for protein sequence analysis using this single-molecule technology.
DNA is an excellent medium for data archival. Recent efforts have illustrated the potential for information storage in DNA using synthesized oligonucleotides assembled in vitro [1–6]. A relatively unexplored avenue of information storage in DNA is the ability to write information into the genome of a living cell by the addition of nucleotides over time. Using the Cas1-Cas2 integrase, the CRISPR-Cas microbial immune system stores the nucleotide content of invading viruses to confer adaptive immunity [7]. Harnessed, this system has the potential to write arbitrary information into the genome [8]. Here, we use the CRISPR-Cas system to encode images and a short movie into the genomes of a population of living bacteria. In doing so, we push the technical limits of this information storage system and optimize strategies to minimize those limitations. We additionally uncover underlying principles of the CRISPR-Cas adaptation system, including sequence determinants of spacer acquisition relevant for understanding both the basic biology of bacterial adaptation and its technological applications. This work demonstrates that this system can capture and stably store practical amounts of real data within the genomes of populations of living cells.
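To make the idea of writing arbitrary data as nucleotides concrete, here is a minimal sketch of a generic mapping from bytes to bases. It is an illustration under stated assumptions, not the encoding used in the study: the 2-bits-per-base table, the function names, and the `length` parameter for splitting a sequence into fixed-size pieces are all hypothetical, and the sketch ignores biological constraints on which sequences are acquired efficiently.

```python
# Illustrative only: a generic 2-bits-per-base mapping.
# The table, names, and chunk length are assumptions, not the authors' scheme.
BASES = "ACGT"  # A=00, C=01, G=10, T=11 (arbitrary choice)


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(seq: str) -> bytes:
    """Invert bytes_to_dna; the sequence length must be a multiple of 4."""
    if len(seq) % 4:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        value = 0
        for base in seq[i:i + 4]:
            value = (value << 2) | BASES.index(base)
        out.append(value)
    return bytes(out)


def split_into_oligos(seq: str, length: int) -> list[str]:
    """Cut a sequence into consecutive pieces of at most `length` bases."""
    if length <= 0:
        raise ValueError("length must be positive")
    return [seq[i:i + length] for i in range(0, len(seq), length)]


if __name__ == "__main__":
    payload = bytes([0, 64, 128, 255])  # e.g. four grayscale pixel values
    encoded = bytes_to_dna(payload)
    print(encoded)
    print(split_into_oligos(encoded, 8))
    print(dna_to_bytes(encoded) == payload)
```

A practical scheme would also need to record the order of the pieces, since a population of cells does not preserve the order in which separate sequences were supplied unless that order is encoded or inferred.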
The ability to write a stable record of identified molecular events into a specific genomic locus would enable the examination of long cellular histories and have many applications, ranging from developmental biology to synthetic devices. We show that the type I-E CRISPR-Cas system of E. coli can mediate acquisition of defined pieces of synthetic DNA. We harnessed this feature to write records of specific DNA sequences into a population of bacterial genomes. We then applied directed evolution to alter the recognition of a protospacer adjacent motif by the Cas1-Cas2 complex, which enabled recording in two modes simultaneously. We used this system to reveal aspects of spacer acquisition that are fundamental to the CRISPR-Cas adaptation process. These results lay the foundations of a multimodal intracellular recording device.
Previously we showed that the protein unfoldase ClpX could facilitate translocation of individual proteins through the α-hemolysin nanopore. Translocation produces ionic current fluctuations that correlate with unfolding and passage of intact protein strands through the pore lumen. It is plausible that this technology could be used to identify, at the single-molecule level, protein domains and structural modifications that arise from subtle changes in primary amino acid sequence (e.g., point mutations). As a test, we engineered proteins bearing well-characterized domains connected in series along an ∼700 amino acid strand. Point mutations in a titin immunoglobulin domain (titin I27), as well as point mutations, proteolytic cleavage, and rearrangement of beta-strands in green fluorescent protein (GFP), caused changes in the ionic current patterns of single strands that were predicted by bulk-phase and force spectroscopy experiments. Among these variants, individual proteins could be classified with 86–99% accuracy using standard machine learning tools. We conclude that a ClpXP-nanopore device can discriminate among distinct protein domains, and that sequence-dependent variations within those domains are detectable.