Background: Production of proteins as therapeutic agents, research reagents and molecular tools frequently depends on expression in heterologous hosts. Synthetic genes are increasingly used for protein production because sequence information is easier to obtain than the corresponding physical DNA. Protein-coding sequences are commonly re-designed to enhance expression, but there are no experimentally supported design principles.

Principal Findings: To identify sequence features that affect protein expression we synthesized and expressed in E. coli two sets of 40 genes encoding two commercially valuable proteins, a DNA polymerase and a single chain antibody. Genes differing only in synonymous codon usage expressed protein at levels ranging from undetectable to 30% of cellular protein. Using partial least squares regression we tested the correlation of protein production levels with parameters that have been reported to affect expression. We found that the amount of protein produced in E. coli was strongly dependent on the codons used to encode a subset of amino acids. Favorable codons were predominantly those read by tRNAs that are most highly charged during amino acid starvation, not codons that are most abundant in highly expressed E. coli proteins. Finally we confirmed the validity of our models by designing, synthesizing and testing new genes using codon biases predicted to perform well.

Conclusion: The systematic analysis of gene design parameters shown in this study has allowed us to identify codon usage within a gene as a critical determinant of achievable protein expression levels in E. coli. We propose a biochemical basis for this, as well as design algorithms to ensure high protein production from synthetic genes. Replication of this methodology should allow similar design algorithms to be empirically derived for any expression system.
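As a minimal sketch of how the codon usage of a gene variant might be quantified before relating it to expression level, the following Python computes, for each codon in a coding sequence, its share among the synonymous codons used for the same amino acid. The function names and the toy sequences are illustrative assumptions, not the authors' pipeline, and the partial least squares regression step itself is not reproduced here.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(cds):
    """Split a coding sequence into codons; the length must be a multiple of 3."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds):
    """Translate a coding sequence with the standard genetic code."""
    return "".join(CODON_TABLE[codon] for codon in split_codons(cds))


def codon_fractions(cds):
    """Return, per codon, its fraction among codons used for the same amino acid."""
    counts = Counter(split_codons(cds))
    per_amino_acid = defaultdict(int)
    for codon, n in counts.items():
        per_amino_acid[CODON_TABLE[codon]] += n
    return {
        codon: n / per_amino_acid[CODON_TABLE[codon]]
        for codon, n in counts.items()
    }


# Two hypothetical variants that differ only in synonymous codon usage.
variant_a = "ATGCTGCTG"  # Met-Leu-Leu, both Leu encoded by CTG
variant_b = "ATGCTGTTA"  # Met-Leu-Leu, Leu encoded by CTG and TTA

print(translate(variant_a) == translate(variant_b))
print(codon_fractions(variant_b))
```

Per-codon fractions like these, computed for each gene variant, are the kind of feature vector that could then be regressed against measured expression levels.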
The re-use of previously validated designs is critical to the evolution of synthetic biology from a research discipline to an engineering practice. Here we describe the Synthetic Biology Open Language (SBOL), a proposed data standard for exchanging designs within the synthetic biology community. SBOL represents synthetic biology designs in a community-driven, formalized format for exchange between software tools, research groups and commercial service providers. The SBOL Developers Group has implemented SBOL as an XML/RDF serialization and provides software libraries and specification documentation to help developers implement SBOL in their own software. We describe early successes, including a demonstration of the utility of SBOL for information exchange between several different software tools and repositories from both academic and industrial partners. As a community-driven standard, SBOL will be updated as synthetic biology evolves to provide specific capabilities for different aspects of the synthetic biology workflow.

Synthetic biology treats biological organisms as a new technological medium with a unique set of characteristics, such as the ability to self-repair, evolve and replicate. These characteristics create their own engineering challenges, but offer a rich and largely untapped source of potential applications across a broad range of sectors [1,2]. Applications such as biomolecular computing [3], metabolic engineering [4], or reconstruction and exploration of natural cell biology [5,6] commonly require the design of new genetically encoded systems. As engineers, synthetic biologists most often base their designs on previously described 'DNA segments' (see Supplementary Table 1 for definitions of selected terms) to meet their design requirements. Reuse of the DNA sequence for these segments involves their exchange between laboratories and their hierarchical composition to form devices and systems with higher level function.

Every engineering field relies on a set of 'standards' [7] that practitioners follow to enable the exchange and reuse of designs for 'systems', 'devices' and 'components'. Similarly, the representation of synthetic biology designs using computer-readable 'data standards' has the potential to facilitate the forward engineering of novel biological systems from previously characterized devices and components. For example, such standards could enable synthetic biology companies to offer catalogs of devices and components by means of computer-readable data sheets, just as modern semiconductor companies do for electronics. Such standards could also enable a synthetic biologist to develop portions of a design using one software tool, refine the design using another tool, and finally transmit it electronically to a colleague or commercial fabrication company.

In order for synthetic biology designs to scale up in complexity, researchers will need to make greater use of specialized design tools and parts repositories. Seamless inter-tool communication would, for example, allow the separation of gene...
SCHEMA structure-guided recombination of 3 fungal class II cellobiohydrolases (CBH II cellulases) has yielded a collection of highly thermostable CBH II chimeras. Twenty-three of 48 genes sampled from the 6,561 possible chimeric sequences were secreted by the Saccharomyces cerevisiae heterologous host in catalytically active form. Five of these chimeras have half-lives of thermal inactivation at 63°C that are greater than that of the most stable parent, the CBH II enzyme from the thermophilic fungus Humicola insolens, which suggests that this chimera collection contains hundreds of highly stable cellulases. Twenty-five new sequences were designed based on mathematical modeling of the thermostabilities for the first set of chimeras. Ten of these sequences were expressed in active form; all 10 retained more activity than H. insolens CBH II after incubation at 63°C. The 15 validated thermostable CBH II enzymes in total have high sequence diversity, differing from their closest natural homologs at up to 63 amino acid positions. Selected purified thermostable chimeras hydrolyzed phosphoric acid swollen cellulose at temperatures 7 to 15°C higher than the parent enzymes. These chimeras also hydrolyzed as much or more cellulose than the parent CBH II enzymes in long-time cellulose hydrolysis assays and had pH/activity profiles as broad as, or broader than, those of the parent enzymes. Generating this group of diverse, thermostable fungal CBH II chimeras is the first step in building an inventory of stable cellulases from which optimized enzyme mixtures for biomass conversion can be formulated.
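The library size quoted above can be checked with a short enumeration. This is a sketch under one stated assumption: the abstract gives 3 parents and 6,561 possible chimeras but not the number of sequence blocks, so the value of 8 blocks is inferred here from 3^8 = 6,561. The digit-string labels (one parent index per block) are a hypothetical naming scheme for illustration only.

```python
from itertools import product

N_PARENTS = 3  # stated in the abstract
N_BLOCKS = 8   # assumption: inferred from 3**8 == 6561, not stated in the abstract


def enumerate_chimeras(n_parents=N_PARENTS, n_blocks=N_BLOCKS):
    """Yield one label per chimera: the parent index (1-based) chosen for each block."""
    for choice in product(range(1, n_parents + 1), repeat=n_blocks):
        yield "".join(str(parent) for parent in choice)


library = list(enumerate_chimeras())
print(len(library))  # 6561

# The three parent sequences appear in the library as the uniform labels.
parents = [str(p) * N_BLOCKS for p in range(1, N_PARENTS + 1)]
print(all(p in library for p in parents))
```

The count includes the 3 unrecombined parents, so under this assumption the number of true chimeras is 6,561 minus 3.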
biofuels | cellobiohydrolase | cellulose hydrolysis | Trichoderma reesei | CBH II

The performance of cellulase mixtures in biomass conversion processes depends on many enzyme properties including stability, product inhibition, synergy among different cellulase components, productive binding versus nonproductive adsorption and pH dependence, in addition to the cellulose substrate physical state and composition. Given the multivariate nature of cellulose hydrolysis, it is desirable to have diverse cellulases to choose from to optimize enzyme formulations for different applications and feedstocks. Recent studies have documented the superior performance of cellulases from thermophilic fungi relative to their mesophilic counterparts in laboratory scale biomass conversion processes (1, 2), where enhanced stability leads to retention of activity over longer periods of time at both moderate and elevated temperatures. Fungal cellulases are attractive because they are highly active and can be expressed in fungal hosts such as Hypocrea jecorina (anamorph Trichoderma reesei) at levels up to 40 g/L in the supernatant. Unfortunately, the set of documented thermostable fungal cellulases is small. In the case of the processive cellobiohydrolase class II (CBH II) enzymes, <10 natural thermostable gene sequences are annotated in the CAZy database (www.cazy.org). This limited number, combined with the difficulty of using directed evolution to generate diverse thermostable...