Developing predictive models of multi-protein genetic systems to understand and optimize their behavior remains a combinatorial challenge, particularly when measurement throughput is limited. We developed a computational approach to build predictive models and identify optimal sequences and expression levels, while circumventing combinatorial explosion. Maximally informative genetic system variants were first designed by the RBS Library Calculator, an algorithm to design sequences for efficiently searching a multi-protein expression space across a > 10,000-fold range with tailored search parameters and well-predicted translation rates. We validated the algorithm's predictions by characterizing 646 genetic system variants, encoded in plasmids and genomes, expressed in six gram-positive and gram-negative bacterial hosts. We then combined the search algorithm with system-level kinetic modeling, requiring the construction and characterization of 73 variants to build a sequence-expression-activity map (SEAMAP) for a biosynthesis pathway. Using model predictions, we designed and characterized 47 additional pathway variants to navigate its activity space, find optimal expression regions with desired activity response curves, and relieve rate-limiting steps in metabolism. Creating sequence-expression-activity maps accelerates the optimization of many protein systems and allows previous measurements to quantitatively inform future designs.
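The combinatorial growth described above can be made concrete with a small sketch. This is not the RBS Library Calculator's algorithm; it only illustrates, under assumed numbers, how a log-spaced search across a 10,000-fold range grows with the number of proteins. The function names and the choice of five levels and three proteins are hypothetical.

```python
import math

def log_spaced_levels(low: float, high: float, n: int) -> list[float]:
    """Return n expression levels evenly spaced on a log scale from low to high."""
    ratio = (high / low) ** (1.0 / (n - 1))
    return [low * ratio ** i for i in range(n)]

def full_grid_size(n_levels: int, n_proteins: int) -> int:
    """Number of variants needed to test every combination of levels."""
    return n_levels ** n_proteins

# Hypothetical example: 5 levels per protein spanning a 10,000-fold range.
levels = log_spaced_levels(1.0, 10_000.0, 5)
print([round(x) for x in levels])

# Exhaustive coverage grows exponentially with the number of proteins.
for n_proteins in (1, 2, 3, 5):
    print(n_proteins, full_grid_size(5, n_proteins))
```

Even with a coarse grid, exhaustive coverage quickly exceeds what a low-throughput assay can measure, which is the motivation for choosing a small set of informative variants and fitting a model to them.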
An mRNA's translation rate is controlled by several sequence determinants, including the presence of RNA structures within the N-terminal regions of its coding sequences. However, the physical rules that govern when such mRNA structures will inhibit translation remain unclear. Here, we introduced systematically designed RNA hairpins into the N-terminal coding region of a reporter protein with steadily increasing distances from the start codon, followed by characterization of their mRNA and expression levels in Escherichia coli. We found that the mRNAs’ translation rates were repressed, by up to 530-fold, when mRNA structures overlapped with the ribosome's footprint. In contrast, when the mRNA structure was located outside the ribosome's footprint, translation was repressed by <2-fold. By combining our measurements with biophysical modeling, we determined that the ribosomal footprint extends 13 nucleotides into the N-terminal coding region and that, when an mRNA structure fully or partially overlaps with the ribosomal footprint, the free energy to unfold only the overlapping structure controls the extent of translation repression. Overall, our results provide precise quantification of the rules governing translation initiation at N-terminal coding regions, improving the predictive design of post-transcriptional regulatory elements that regulate translation rate.
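The footprint rule above can be sketched as a toy calculation. The 13-nucleotide footprint comes from the abstract; everything else is an assumption made for illustration: the Boltzmann-type form exp(-beta * dG), the value beta = 0.45 mol/kcal (an apparent constant commonly used in thermodynamic models of translation initiation), the position convention, and the function names. Structures outside the footprint are treated here as non-repressing, which simplifies the reported <2-fold effect.

```python
import math

FOOTPRINT_END = 13  # nt into the coding region (from the abstract)
BETA = 0.45         # mol/kcal; assumed apparent Boltzmann factor

def overlaps_footprint(hairpin_start: int, footprint_end: int = FOOTPRINT_END) -> bool:
    """True if the structure begins within the footprint.

    hairpin_start is the assumed 1-based position, counted into the
    coding region, of the first structured nucleotide.
    """
    return hairpin_start <= footprint_end

def fold_repression(hairpin_start: int, dg_unfold_overlap: float,
                    beta: float = BETA) -> float:
    """Estimated fold-repression of translation rate.

    dg_unfold_overlap is the free energy (kcal/mol, >= 0) needed to
    unfold only the part of the structure that overlaps the footprint.
    """
    if not overlaps_footprint(hairpin_start):
        return 1.0  # simplification: no repression outside the footprint
    return math.exp(beta * dg_unfold_overlap)

# Hypothetical hairpins: same unfolding cost, different positions.
print(fold_repression(hairpin_start=5, dg_unfold_overlap=10.0))
print(fold_repression(hairpin_start=20, dg_unfold_overlap=10.0))
```

The point of the sketch is that only the overlapping portion's unfolding energy enters the estimate, so a very stable hairpin placed just beyond the footprint is predicted to have little effect.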
NADPH is an essential cofactor for the biosynthesis of several high-value chemicals, including isoprenoids, fatty acid-based fuels, and biopolymers. Tunable control over all potentially rate-limiting steps, including the NADPH regeneration rate, is crucial to maximizing production titers. We have rationally engineered a synthetic version of the Entner-Doudoroff pathway from Zymomonas mobilis that increased the NADPH regeneration rate in Escherichia coli MG1655 by 25-fold. To do this, we combined systematic design rules, biophysical models, and computational optimization to design synthetic bacterial operons expressing the 5-enzyme pathway, while eliminating undesired genetic elements for maximum expression control. NADPH regeneration rates from genome-integrated pathways were estimated using an NADPH-binding fluorescent reporter and by the productivity of an NADPH-dependent terpenoid biosynthesis pathway. We designed and constructed improved pathway variants by employing the RBS Library Calculator to efficiently search the 5-dimensional enzyme expression space and by performing 40 cycles of MAGE for site-directed genome mutagenesis. We screened 624 pathway variants using an NADPH-dependent blue fluorescent protein (mBFP) and further characterized 22 of them to determine the relationship between enzyme expression levels and NADPH regeneration rates. The best variant exhibited 25-fold higher normalized mBFP levels than the wild-type strain. Combining the synthetic Entner-Doudoroff pathway with an optimized terpenoid pathway further increased the terpenoid titer by 97%.
The ability to precisely modify genomes and regulate specific genes will greatly accelerate several medical and engineering applications. The CRISPR/Cas9 (Type II) system binds and cuts DNA using guide RNAs, though the variables that control its on-target and off-target activity remain poorly characterized. Here, we develop and parameterize a system-wide biophysical model of Cas9-based genome editing and gene regulation to predict how changing guide RNA sequences, DNA superhelical densities, Cas9 and crRNA expression levels, organisms and growth conditions, and experimental conditions collectively control the dynamics of dCas9-based binding and Cas9-based cleavage at all DNA sites with both canonical and non-canonical PAMs. We combine statistical thermodynamics and kinetics to model Cas9:crRNA complex formation, diffusion, site selection, reversible R-loop formation, and cleavage, using large amounts of structural, biochemical, expression, and next-generation sequencing data to determine kinetic parameters and develop free energy models. Our results identify DNA supercoiling as a novel mechanism controlling Cas9 binding. Using the model, we predict Cas9 off-target binding frequencies across the lambda phage and human genomes, and explain why Cas9’s off-target activity can be so high. With this improved understanding, we propose several rules for designing experiments that minimize off-target activity. We also discuss the implications for engineering dCas9-based genetic circuits.
Microfluidic droplet sorting enables the high‐throughput screening and selection of water‐in‐oil microreactors at speeds and volumes unparalleled by traditional well‐plate approaches. Most such systems sort using fluorescent reporters on modified substrates or reactions that are rarely industrially relevant. We describe a microfluidic system for high‐throughput sorting of nanoliter droplets based on direct detection using electrospray ionization mass spectrometry (ESI‐MS). Droplets are split, one portion is analyzed by ESI‐MS, and the second portion is sorted based on the MS result. Throughput of 0.7 samples s−1 is achieved with 98 % accuracy using a self‐correcting and adaptive sorting algorithm. We use the system to screen ≈15 000 samples in 6 h and demonstrate its utility by sorting 25 nL droplets containing transaminase expressed in vitro. Label‐free ESI‐MS droplet screening expands the toolbox for droplet detection and recovery, improving the applicability of droplet sorting to protein engineering, drug discovery, and diagnostic workflows.
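As a quick consistency check on the figures above, the reported throughput and run time can be multiplied out. This is only arithmetic on the numbers stated in the abstract; the function name is hypothetical, and the calculation assumes the stated rate is sustained for the whole run.

```python
def samples_screened(rate_per_s: float, hours: float) -> float:
    """Total samples processed at a constant rate over the given duration."""
    return rate_per_s * hours * 3600

total = samples_screened(rate_per_s=0.7, hours=6)
print(round(total))  # 15120, consistent with the reported ~15,000 samples
```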
Directed Evolution is a key technology driving the utility of biocatalysis in pharmaceutical synthesis. Conventional approaches to Directed Evolution are conducted using bacterial cells expressing enzymes in microplates, with catalyzed reactions measured by HPLC, high-performance liquid chromatography-mass spectrometry (HPLC-MS), or optical detectors, which require either long cycle times or tailor-made substrates. To better fit modern, fast-paced process chemistry development where solutions are rapidly needed for new substrates, droplet microfluidics interfaced with electrospray ionization (ESI)-MS provides a label-free high-throughput screening platform. To apply this method to industrial enzyme screening and to explore potential approaches that may further improve the overall throughput, we optimized the existing droplet–MS methods. Carryover between droplets, traditionally a significant issue, was reduced to an undetectable level by replacing the stainless steel ESI needle with a Teflon needle within a capillary electrophoresis (CE)–MS source. Throughput was improved to 3 Hz with a wide range of droplet sizes (10–50 nL) by tuning the sheath flow within the CE–MS source. The optimized method was demonstrated by screening reactions using two different transaminase libraries. Good correlations (r² ∼ 0.95) were found between the droplet–MS and LC–MS methods, with 100% match on hit variants. We further explored the capability of the system by performing in vitro transcription–translation inside the droplets and directly analyzing the intact reaction mixture droplets by MS. The synthesized protein attained comparable activity to the protein standard, and the complex samples appeared well tolerated by the MS. The success of these applications indicates that MS analysis of microfluidic droplets is a viable option for considerably accelerating the screening of enzyme evolution libraries.
The emergence of new therapeutic modalities requires complementary tools for their efficient syntheses. Availability of methodologies for site-selective modification of biomolecules remains a long-standing challenge, given the inherent complexity and the presence of repeating residues that bear functional groups with similar reactivity profiles. We describe a bioconjugation strategy for modification of native peptides relying on high site selectivity conveyed by enzymes. We engineered penicillin G acylases to distinguish among free amino moieties of insulin (two at amino termini and an internal lysine) and manipulate cleavable phenylacetamide groups in a programmable manner to form protected insulin derivatives. This enables selective and specific chemical ligation to synthesize homogeneous bioconjugates, improving yield and purity compared to existing methods, and generally opens avenues in the functionalization of native proteins to access biological probes or drugs.