2023
DOI: 10.1021/acssynbio.3c00301

DeCOIL: Optimization of Degenerate Codon Libraries for Machine Learning-Assisted Protein Engineering

Jason Yang,
Julie Ducharme,
Kadina E. Johnston
et al.

Abstract: With advances in machine learning (ML)-assisted protein engineering, models based on data, biophysics, and natural evolution are being used to propose informed libraries of protein variants to explore. Synthesizing these libraries for experimental screens is a major bottleneck, as the cost of obtaining large numbers of exact gene sequences is often prohibitive. Degenerate codon (DC) libraries are a cost-effective alternative for generating combinatorial mutagenesis libraries where mutations are targeted to a h…
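
To make the idea of a degenerate codon concrete, the following is a minimal sketch (not taken from the DeCOIL code) of how a single DC written in IUPAC nucleotide codes expands into exact codons and the amino acids they encode under the standard genetic code. The function names `expand_codon` and `amino_acid_counts` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import product

# IUPAC nucleotide ambiguity codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Standard genetic code in the conventional T, C, A, G ordering ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand_codon(dc):
    """Return every exact codon covered by a 3-letter degenerate codon."""
    return ["".join(bases) for bases in product(*(IUPAC[s] for s in dc.upper()))]


def amino_acid_counts(dc):
    """Count how many covered codons encode each amino acid (or stop)."""
    return Counter(CODON_TABLE[codon] for codon in expand_codon(dc))


if __name__ == "__main__":
    counts = amino_acid_counts("NNK")
    print(len(expand_codon("NNK")), "codons")
    print(dict(sorted(counts.items())))
```

A library with one degenerate codon at each targeted site is the combinatorial product of these per-site distributions, so the choice of DCs determines both which variants appear and how unevenly they are represented.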

Cited by 13 publications (8 citation statements)
References 64 publications
“…LLMs could power these automated systems, with AI flexibly adapting to perform new types of syntheses and screens with robotic scripts written on the fly. At the same time, multiple desirable properties and activity for multiple reactions could be optimized simultaneously during protein engineering campaigns, powered by generalized ML models that can utilize multimodal representations of proteins. With ever increasing amounts of data on protein structures and sequence-fitness pairs, and new tools to conduct experiments and make ML methods for proteins more accessible to the broader community, the future of ML-assisted protein engineering is bright.…”
Section: Conclusion: Toward General Self-driven Protein Engineering (mentioning)
confidence: 99%
“…Protein stability is a critical factor in protein design, ensuring the designed protein can fold into and maintain its specific three-dimensional structure [66]. Stability is typically quantified by the change in Gibbs free energy (ΔΔG) of folding between the redesigned protein and the wild-type (original) protein, where ΔΔG = ΔG_orig − ΔG_redesign.…”
Section: Generating Protein Pockets For Therapeutic Small Molecule Li... (mentioning)
confidence: 99%
“…This study demonstrated combining DL with evolutionary density modeling allows unsupervised estimation of protein fitness. Also, Arnold et al developed DeCOIL [216] to optimize Design Choices (DCs) for creating well-informed combinatorial mutagenesis protein variant libraries based on favorable ΔΔG fold value and EVmutation. The latter were validated on two enzymes: B1 domain of protein G (GB1) and tryptophan synthase (TrpB) from Thermotoga maritima.…”
Section: Deep Learning Approaches (mentioning)
confidence: 99%