Recoding, the repurposing of genetic codons, is a powerful strategy for enhancing genomes with functions not commonly found in nature. Here, we report computational design, synthesis, and progress toward assembly of a 3.97-megabase, 57-codon Escherichia coli genome in which all 62,214 instances of seven codons were replaced with synonymous alternatives across all protein-coding genes. We have validated 63% of recoded genes by individually testing 55 segments of 50 kilobases each. We observed that 91% of tested essential genes retained functionality with limited fitness effect. We demonstrate identification and correction of lethal design exceptions, only 13 of which were found in 2229 genes. This work underscores the feasibility of rewriting genomes and establishes a framework for large-scale design, assembly, troubleshooting, and phenotypic analysis of synthetic organisms.
Selection has been invaluable for genetic manipulation, although counter-selection has historically exhibited limited robustness and convenience. TolC, an outer membrane pore involved in transmembrane transport in E. coli, has been implemented as a selectable/counter-selectable marker, but counter-selection escape frequency using colicin E1 precludes using tolC for inefficient genetic manipulations and/or with large libraries. Here, we leveraged unbiased deep sequencing of 96 independent lineages exhibiting counter-selection escape to identify loss-of-function mutations, which offered mechanistic insight and guided strain engineering to reduce counter-selection escape frequency by ∼40-fold. We fundamentally improved the tolC counter-selection by supplementing with a second agent, vancomycin, which reduces counter-selection escape by 425-fold compared with colicin E1 alone. Combining these improvements in a mismatch repair proficient strain reduced counter-selection escape frequency by 1.3E6-fold in total, making tolC counter-selection as effective as most selectable markers, and adding a valuable tool to the genome editing toolbox. These improvements permitted us to perform stable and continuous rounds of selection/counter-selection using tolC, enabling replacement of 10 alleles without requiring genotypic screening for the first time. Finally, we combined these advances to create an optimized E. coli strain for genome engineering that is ∼10-fold more efficient at achieving allelic diversity than previous best practices.
The degeneracy of the genetic code allows nucleic acids to encode amino acid identity as well as noncoding information for gene regulation and genome maintenance. The rare arginine codons AGA and AGG (AGR) present a case study in codon choice, with AGRs encoding important transcriptional and translational properties distinct from the other synonymous alternatives (CGN). We created a strain of Escherichia coli with all 123 instances of AGR codons removed from all essential genes. We readily replaced 110 AGR codons with the synonymous CGU codons, but the remaining 13 "recalcitrant" AGRs required diversification to identify viable alternatives. Successful replacement codons tended to conserve local ribosomal binding site-like motifs and local mRNA secondary structure, sometimes at the expense of amino acid identity. Based on these observations, we empirically defined metrics for a multidimensional "safe replacement zone" (SRZ) within which alternative codons are more likely to be viable. To evaluate synonymous and nonsynonymous alternatives to essential AGRs further, we implemented a CRISPR/Cas9-based method to deplete a diversified population of a wild-type allele, allowing us to evaluate exhaustively the fitness impact of all 64 codon alternatives. Using this method, we confirmed the relevance of the SRZ by tracking codon fitness over time in 14 different genes, finding that codons that fall outside the SRZ are rapidly depleted from a growing population. Our unbiased and systematic strategy for identifying unpredicted design flaws in synthetic genomes and for elucidating rules governing codon choice will be crucial for designing genomes exhibiting radically altered genetic codes.
Keywords: codon choice | genome editing | recoded genomes
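The naive first pass described above, replacing each in-frame AGR codon with the synonymous CGU, can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' design pipeline: the function name `recode_agr` and the toy sequences are hypothetical, the input is assumed to be an in-frame coding sequence in the RNA alphabet, and the sketch performs none of the ribosomal binding site or mRNA structure checks that the abstract reports as necessary for the 13 recalcitrant codons.

```python
# Hypothetical helper: swap in-frame AGA/AGG codons for CGU.
# Assumes `cds` is an in-frame coding sequence written in the RNA alphabet.
AGR_CODONS = {"AGA", "AGG"}


def recode_agr(cds: str, replacement: str = "CGU") -> tuple[str, int]:
    """Return the recoded sequence and the number of AGR codons replaced."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Split into codons so that only in-frame triplets are considered.
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    replaced = sum(codon in AGR_CODONS for codon in codons)
    recoded = "".join(replacement if codon in AGR_CODONS else codon
                      for codon in codons)
    return recoded, replaced


if __name__ == "__main__":
    # Toy sequence: AUG AGA CGU AGG UAA
    print(recode_agr("AUGAGACGUAGGUAA"))
```

Splitting into codons before matching matters: a plain substring replacement would also rewrite out-of-frame occurrences such as the AGA spanning two codons in `AUG CAG AUU`, changing the encoded protein.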
Background: Vibrio Pathogenicity Island-2 (VPI-2) is a 57 kb region present in choleragenic V. cholerae isolates that is required for growth on sialic acid as a sole carbon source. V. cholerae non-O1/O139 pathogenic strains also contain VPI-2, which in these strains encodes a type 3 secretion system in addition to the sialic acid catabolism genes. VPI-2 integrates into chromosome 1 at a tRNA-serine site and encodes an integrase, intV2 (VC1758), that belongs to the tyrosine recombinase family. IntV2 is required for VPI-2 excision from chromosome 1, which occurs at very low levels, and for formation of a non-replicative circular intermediate.
Results: We determined the conditions and the factors that affect excision of VPI-2 in V. cholerae N16961. We demonstrate that excision from chromosome 1 is induced at low temperature and after sublethal UV-light irradiation treatment. In addition, UV-irradiated cells showed increased expression, relative to untreated cells, of three genes encoded within VPI-2: intV2 (VC1758) and two putative recombination directionality factors (RDFs), vefA (VC1785) and vefB (VC1809). We demonstrate that along with IntV2, the RDF VefA is essential for excision. We constructed a knockout mutant of vefA in V. cholerae N16961 and found that no excision of VPI-2 occurred, indicating that a functional vefA gene is required for excision. Deletion of the second RDF, encoded by vefB, did not result in a loss of excision. Among Vibrio species in the genome database, we identified 27 putative RDFs within regions that also encoded IntV2 homologues. Within each species, the RDFs and their cognate IntV2 proteins were associated with different island regions, suggesting that this pairing is widespread.
Conclusions: We demonstrate that excision of VPI-2 is induced under some environmental stress conditions, and we show for the first time that an RDF encoded within a pathogenicity island in V. cholerae is required for excision of the region.