2022
DOI: 10.1101/2022.10.07.502662
Preprint

Machine Learning Optimization of Candidate Antibodies Yields Highly Diverse Sub-nanomolar Affinity Antibody Libraries

Abstract: Therapeutic antibodies are an important and rapidly growing drug modality. However, the design and discovery of early-stage antibody therapeutics remain a time- and cost-intensive endeavor. In this work, we present an end-to-end Bayesian, language model-based method for designing large and diverse libraries of high-affinity single-chain variable fragments (scFvs). We integrate target-specific binding affinities with information from millions of natural protein sequences in a probabilistic machine learning frame…

Cited by 7 publications (7 citation statements)
References: 34 publications

“…An alternative in silico evaluation strategy avoids the challenge of defining a meaningful oracle function by adopting a pool-based optimisation problem formulation over experimentally determined fitness landscapes [30]. Another line of works has sought to provide direct experimental validation of approaches combining uncertainty estimates with PLMs, in settings ranging from zero [14] or few-shot design [2] to single-round design given large training sets of sequence-fitness pairs [24]. In this paper, we focus on evaluating different PLM-based fitness prediction strategies in in silico settings designed to mimic applied design scenarios.…”
Section: Related Work (mentioning)
confidence: 99%
“…Code 48 is available on github: https://github.com/AIforGreatGood/biotransfer and on Zenodo under https://doi.org/10.5281/zenodo.7927152 for academic and/or non-profit internal research purposes.…”
Section: Statistics and Reproducibility (mentioning)
confidence: 99%
“…Li et al [15] used a pre-trained LLM, Gaussian Processes (GPs), and ensemble regression models to design and screen new high-affinity single-chain fragment variable antibodies (scFvs) against a conserved coronavirus peptide. They trained their models on experimental data (26.5k heavy and 26.2k light chain sequences) that involved up to three random CDR mutations from an initial candidate.…”
Section: Introduction (mentioning)
confidence: 99%