2021
DOI: 10.1101/2021.12.13.472419
Preprint

Generative language modeling for antibody design

Abstract: Successful development of monoclonal antibodies (mAbs) for therapeutic applications is hindered by developability issues such as low solubility, low thermal stability, high aggregation, and high immunogenicity. The discovery of more developable mAb candidates relies on high-quality antibody libraries for isolating candidates with desirable properties. We present Immunoglobulin Language Model (IgLM), a deep generative language model for generating synthetic libraries by re-designing variable-length spans of ant…



Cited by 37 publications (42 citation statements)
References 59 publications
“…Deep learning methods trained on antibody sequences and structures hold great promise for design of novel therapeutic and diagnostic molecules. Generative models trained on large numbers of natural antibody sequences can produce effective libraries for antibody discovery (28, 29). Self-supervised models have also proven effective for humanization of antibodies (27).…”
Section: Discussion
confidence: 99%
“…Models trained for masked language modeling have been shown to learn meaningful representations of immune repertoire sequences (21, 25, 26), and even repurposed to humanize antibodies (27). Generative models trained on sequence infilling have been shown to generate high-quality antibody libraries (28, 29).…”
Section: Introduction
confidence: 99%
“…Therefore, the models are not evaluated in terms of which downstream tasks can be applied via transfer learning. Recently, attempts have appeared that utilize large language models in repertoire analysis (133–137, 180). In AntiBERTa (137), fine-tuning for a downstream task is also investigated.…”
Section: Discussion
confidence: 99%
“…The utilization of language models is not limited to embedding. In Shuai et al (137), another language model called GPT-2 (128) is utilized for pretraining an antibody generation model (IgLM). Because GPT-2 is designed for full sentence generation, unlike BERT, IgLM can generate new antibodies (CDRs).…”
Section: Embedding Methods Based On Representation Learning
confidence: 99%
“…An example of an antibody-specific DL model is IgLM, a language model that generates variable-length CDR sequence libraries conditioned on chain type and/or species-of-origin. [22] IgLM-designed synthetic libraries are akin to naïve libraries that can be further screened to obtain a lead antibody sequence. Another antibody-specific DL model treats the problem of antibody CDR generation as an iterative sequence-structure prediction problem.…”
Section: Introduction
confidence: 99%
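The statements above describe IgLM's formulation: a GPT-2-style autoregressive model that infills a variable-length span of an antibody sequence, conditioned on chain-type and species tags. A minimal sketch of how such an infilling prompt can be assembled and the generated span spliced back into the sequence is shown below; the token names ([HEAVY], [HUMAN], [MASK], [SEP]) are illustrative placeholders, not IgLM's actual vocabulary.

```python
def build_infill_prompt(sequence, span_start, span_end,
                        chain="[HEAVY]", species="[HUMAN]"):
    """Remove a variable-length span from the sequence and build a prompt:
    conditioning tags, then the sequence with the span replaced by [MASK],
    then [SEP] to cue the model to emit the missing span.
    Tag names are illustrative, not IgLM's real vocabulary."""
    masked = sequence[:span_start] + "[MASK]" + sequence[span_end:]
    return f"{chain}{species}{masked}[SEP]"


def splice_generated_span(sequence, span_start, span_end, generated_span):
    """Replace the removed region with a model-generated span, yielding a
    full re-designed sequence. The new span may differ in length from the
    original, which is why this formulation handles variable-length CDRs."""
    return sequence[:span_start] + generated_span + sequence[span_end:]


# Example: mask residues 3..6 of a toy heavy-chain fragment.
prompt = build_infill_prompt("EVQLVESGGG", 3, 6)
redesigned = splice_generated_span("EVQLVESGGG", 3, 6, "AKDY")
```

In practice the span would come from autoregressive sampling of a trained model; the splice step simply illustrates why variable-length redesign falls out naturally from this prompt format, unlike BERT-style fixed-position masking.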