Abstract: Significance
SARS-CoV-2 continues to evolve through emerging variants, which are increasingly observed to have higher transmissibility. Despite the wide application of vaccines and antibodies, selection pressure on the Spike protein may drive further evolution of variants carrying mutations that can evade the immune response. To keep pace with the virus's evolution, we introduced a deep learning approach to redesign the complementarity-determining regions (CDRs) to target multiple virus variants and obtai…
“…To the end user, guiding evolution via pretrained, unsupervised models is less resource-intensive than collecting enough task-specific data to train a supervised model [33]. Our techniques can also be used in conjunction with supervised approaches [8], [31]–[34], [52]–[55], and supervising a model over multiple experimental rounds might ultimately lead to higher fitness. However, in many practical settings (for example, the rapid development of sotrovimab in response to the COVID-19 pandemic [35]), the efficiency of an unsupervised, single-round approach is preferable to a protracted, multi-round (machine-learning-guided) directed evolution campaign.…”
Section: Discussion
confidence: 99%
“…In particular, we hypothesized that the predictive capabilities of protein language models might enable a researcher to provide only a single, wildtype antibody sequence to the algorithm and receive a small, manageable set (~10¹) of high-likelihood variants to experimentally measure for desirable properties. This is a very general setting that does not assume knowledge of protein structure or task-specific training data, thereby avoiding the resource-intensive processes associated with structure determination [34] or high-throughput screens [33]. A major question, however, is whether higher evolutionary likelihood would efficiently translate to higher fitness.…”
Section: Efficient Affinity Maturation With General Protein Language ...
confidence: 99%
“…This performance is especially striking given that the underlying language models are completely unsupervised and have no initial task-specific training data. Also notable is that around half of the amino acid substitutions that improve affinity are located in antibody framework regions, which are much less mutated during natural affinity maturation [11] and are thus often excluded from artificial evolution [33], [34].…”
Natural evolution must explore a vast landscape of possible sequences for desirable yet rare mutations, suggesting that learning from natural evolutionary strategies could accelerate artificial evolution. Here, we report that deep learning algorithms known as protein language models can evolve human antibodies with high efficiency, despite providing the models with no information about the target antigen, binding specificity, or protein structure, and also requiring no additional task-specific finetuning or supervision. We performed language-model-guided affinity maturation of seven diverse antibodies, screening 20 or fewer variants of each antibody across only two rounds of evolution. Our evolutionary campaigns improved the binding affinities of four clinically relevant antibodies up to 7-fold and three unmatured antibodies up to 160-fold across diverse viral antigens, with many designs also demonstrating improved thermostability and viral neutralization activity. Notably, our algorithm requires only a single wildtype sequence and computes recommended amino acid changes in less than a second. Moreover, the same models that improve antibody binding also guide efficient evolution across diverse protein families and selection pressures, indicating that these results generalize to many natural settings. Contrary to prevailing notions of evolution as difficult and resource-intensive, our results suggest that when constrained to a narrow manifold of evolutionary plausibility, evolution can become much easier, which we refer to as the “efficient manifold hypothesis.”
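The recommendation step this abstract describes — providing only a wildtype sequence and receiving a ranked shortlist of substitutions — can be sketched as follows. This is a minimal illustration, not the paper's implementation: `toy_model` is a hypothetical stand-in for a pretrained protein language model's per-position residue likelihoods, and the acceptance rule (suggest substitutions the model scores as more likely than the wildtype residue) is a simplified reading of the approach.

```python
from typing import Callable, Dict, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def recommend_substitutions(
    wildtype: str,
    position_likelihoods: Callable[[str, int], Dict[str, float]],
    top_k: int = 10,
) -> List[Tuple[int, str, str, float]]:
    """Rank single substitutions whose model likelihood exceeds the
    wildtype residue's likelihood at that position."""
    candidates = []
    for i, wt_aa in enumerate(wildtype):
        probs = position_likelihoods(wildtype, i)
        wt_p = probs.get(wt_aa, 0.0)
        for aa in AMINO_ACIDS:
            # Keep only substitutions the model prefers over wildtype.
            if aa != wt_aa and probs.get(aa, 0.0) > wt_p:
                candidates.append((i, wt_aa, aa, probs[aa]))
    # Highest-likelihood substitutions first; keep a small, screenable set.
    candidates.sort(key=lambda c: c[3], reverse=True)
    return candidates[:top_k]

def toy_model(seq: str, pos: int) -> Dict[str, float]:
    """Hypothetical stand-in for a protein language model: near-uniform
    likelihoods, except it strongly prefers 'K' at position 2."""
    probs = {aa: 0.05 for aa in AMINO_ACIDS}
    if pos == 2:
        probs["K"] = 0.5
    return probs

hits = recommend_substitutions("QVQLV", toy_model, top_k=3)
# With the toy model, only the Q->K substitution at position 2 beats wildtype.
```

In a real setting, `position_likelihoods` would query one or more pretrained models and the top candidates would be synthesized and measured experimentally, matching the small (~10¹) screening budget described above.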
“…Due to the rapid mutation rate, vaccines may lose effectiveness against COVID-19 variants. In fact, it has been reported that recombinant trimeric RBD [5] and neutralizing antibodies [6] have neutralizing effects against the Beta and Delta variants, but not Omicron, and broad protection against the Omicron variant has not yet been reported. Therefore, there is an urgent need to develop a broad-spectrum vaccine against the different COVID-19 variants.…”
It has been reported that the novel coronavirus (COVID-19) has caused more than 286 million cases and 5.4 million deaths to date. Several strategies have been implemented globally, such as social distancing and the development of vaccines. Several severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants have appeared, such as Alpha, Beta, Gamma, Delta, and Omicron. With the rapid spread of the novel coronavirus and its rapidly changing mutants, the development of a broad-spectrum multivalent vaccine is considered the most effective way to defend against the constantly mutating virus. Here, we evaluated the immunogenicity of a multivalent COVID-19 inactivated vaccine. Mice were immunized with the multivalent COVID-19 inactivated vaccine, and the neutralizing antibodies in serum were analyzed. The results show that the HB02 + Delta + Omicron trivalent vaccine could provide broad-spectrum protection against the HB02, Beta, Delta, and Omicron viruses. Additionally, the different multivalent COVID-19 inactivated vaccines could enhance cellular immunity. Together, our findings suggest that the multivalent COVID-19 inactivated vaccine can provide broad-spectrum protection against HB02 and other virus variants in both humoral and cellular immunity, providing new ideas for the development of a broad-spectrum COVID-19 vaccine.
“…Protein engineering is a growing area of research in which scientists use a variety of methods to design new proteins that can perform certain functions: for instance, enzymes that can biodegrade plastics, materials inspired by spider silk, or antibodies to neutralize viruses (Lu et al., 2022; Shan et al., 2022).…”
Using a neural network to predict how green fluorescent proteins respond to genetic mutations illuminates properties that could help design new proteins.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations – citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.