Developing predictive models of multi-protein genetic systems to understand and optimize their behavior remains a combinatorial challenge, particularly when measurement throughput is limited. We developed a computational approach to build predictive models and identify optimal sequences and expression levels, while circumventing combinatorial explosion. Maximally informative genetic system variants were first designed by the RBS Library Calculator, an algorithm to design sequences for efficiently searching a multi-protein expression space across a >10,000-fold range with tailored search parameters and well-predicted translation rates. We validated the algorithm's predictions by characterizing 646 genetic system variants, encoded in plasmids and genomes, expressed in six Gram-positive and Gram-negative bacterial hosts. We then combined the search algorithm with system-level kinetic modeling, requiring the construction and characterization of 73 variants to build a sequence-expression-activity map (SEAMAP) for a biosynthesis pathway. Using model predictions, we designed and characterized 47 additional pathway variants to navigate its activity space, find optimal expression regions with desired activity response curves, and relieve rate-limiting steps in metabolism. Creating sequence-expression-activity maps accelerates the optimization of many protein systems and allows previous measurements to quantitatively inform future designs.
Auxotrophic strains of Agrobacterium tumefaciens were generated for use in liquid co-culture with plant tissue for transient gene expression. Twenty-one auxotrophs were recovered from 1,900 tetracycline-resistant insertional mutants generated with a suicide vector transposon mutagenesis system. Twelve of these auxotrophs were characterized on a nutrient matrix. Isolates were screened for growth in plant cell and root culture, and three auxotrophs were identified that had limited growth: adenine (ade-24), leucine (leu-27), and cysteine (cys-32). Ade-24 displayed poor T-DNA delivery in a transient expression test delivering GUS from a binary vector, while cys-32 displayed the best ability to deliver DNA of these three auxotrophs. The growth yield of cys-32 on cysteine was assessed to provide a quantitative basis for co-culture nutrient supplementation. The utility of cys-32 for delivering T-DNA to plant tissues is demonstrated, where an 85-fold enhancement in GUS expression over wild-type A. tumefaciens was achieved.
A viral vector based on the bean yellow dwarf virus was investigated for its potential to increase transient gene expression. An intron-containing GUS reporter gene and the cis-acting viral regulatory elements were incorporated in the viral vector and could be complemented by the viral replication-associated proteins provided on a secondary vector. All vectors were delivered to Nicotiana glutinosa plant cell suspension or hairy root cultures via auxotrophic Agrobacterium tumefaciens. Cell culture generated a greater yield of reporter gene expression than did root culture, as a result of the limitation imposed on roots to express the protein only in surface tissue containing actively dividing cells. Reporter gene expression increased for cell culture when the reporter gene construct was co-delivered with the construct supplying both viral replication-associated proteins (REP and REPA); gene expression decreased when the construct supplying only the viral REP protein was co-delivered. Reporter protein expression increased from 0.091% for the reporter construct alone to 0.22% total soluble protein (% TSP) when the viral Rep-supplying vector was co-delivered with the reporter gene construct. Reporter protein was generated 3 days after the initiation of bacterial co-culture, providing for rapid generation of heterologous protein in cell culture.