Machine learning (ML) is a key technology to enable accurate prediction of antibody-antigen binding, a prerequisite for in silico vaccine and antibody design. Two orthogonal problems hinder the current application of ML to antibody-specificity prediction and the benchmarking thereof: (i) the lack of a unified formalized mapping of immunological antibody specificity prediction problems into ML notation and (ii) the unavailability of large-scale training datasets. Here, we developed the Absolut! software suite that allows the parameter-based unconstrained generation of synthetic lattice-based 3D-antibody-antigen binding structures with ground-truth access to conformational paratope, epitope, and affinity. We show that Absolut!-generated datasets recapitulate critical biological sequence and structural features that render antibody-antigen binding prediction challenging. To demonstrate the immediate, high-throughput, and large-scale applicability of Absolut!, we have created an online database of 1 billion antibody-antigen structures, the extension of which is only constrained by moderate computational resources. We translated immunological antibody specificity prediction problems into ML tasks and used our database to investigate paratope-epitope binding prediction accuracy as a function of structural information encoding, dataset size, and ML method, which is unfeasible with existing experimental data. Furthermore, we found that in silico investigated conditions, predicted to increase antibody specificity prediction accuracy, align with and extend conclusions drawn from experimental antibody-antigen structural data. In summary, the Absolut! framework enables the development and benchmarking of ML strategies for biotherapeutics discovery and design.
Generative machine learning (ML) has been postulated to be a major driver in the computational design of antigen-specific monoclonal antibodies (mAb). However, efforts to confirm this hypothesis have been hindered by the infeasibility of testing arbitrarily large numbers of antibody sequences for their most critical design parameters: paratope, epitope, affinity, and developability. To address this challenge, we leveraged a lattice-based antibody-antigen binding simulation framework, which incorporates a wide range of physiological antibody binding parameters. The simulation framework both enables the computation of antibody-antigen 3D-structures and functions as an oracle for unrestricted prospective evaluation of the antigen specificity of ML-generated antibody sequences. We found that a deep generative model, trained exclusively on antibody sequence (1D) data, can be used to design native-like conformational (3D) epitope-specific antibodies, matching or exceeding the training dataset in affinity and developability variety. Furthermore, we show that transfer learning enables the generation of high-affinity antibody sequences from low-N training data. Finally, we validated that the antibody design insight gained from simulated antibody-antigen binding data is applicable to experimental real-world data.
Our work establishes a priori feasibility and the theoretical foundation of high-throughput ML-based mAb design.

Highlights
A large-scale dataset of 70M [3 orders of magnitude larger than the current state of the art] synthetic antibody-antigen complexes, that reflect biological complexity, allows the prospective evaluation of antibody generative deep learning
Combination of generative learning, synthetic antibody-antigen binding data, and prospective evaluation shows that deep learning driven antibody design and discovery at an unconstrained level is feasible
Transfer learning (low-N learning) coupled to generative learning shows that antibody-binding rules may be transferred across unrelated antibody-antigen complexes
Experimental validation of antibody-design conclusions drawn from deep learning on synthetic antibody-antigen binding data

Graphical abstract
We leverage large synthetic ground-truth data to demonstrate the (A,B) unconstrained deep generative learning-based generation of native-like antibody sequences, (C) the prospective evaluation of conformational (3D) affinity, paratope-epitope pairs, and developability. (D) Finally, we show increased generation quality of low-N-based machine learning models via transfer learning.
Antibody-based immunotherapies require the tedious identification and development of antibodies with specific properties. In particular, vaccine development for mutating pathogens is challenged by their fast evolution, the complexity of immunodominance, and the heterogeneous immune history of individuals. Mathematical models are critical for predicting successful vaccine conditions or designing potent antibodies. Existing models are limited by their abstract, structurally poor representations of antigen epitopes. Here, we propose a structural lattice-based model for antibody-antigen affinity. An efficient algorithm is given that predicts the best binding structure of an antibody's amino acid sequence around an antigen in reduced computation time. It is suitable for large simulations of affinity maturation. This structural representation contains key physiological properties, such as affinity jumps and cross-reactivity, and successfully reflects the topology of antigen epitopes, such as pockets and shielded residues. We perform in silico immunizations via germinal center simulations and show that our model can explain complex phenomena like recognition of the same epitope from unrelated clones. We show that the use of cocktails of similar epitopes promotes the development of cross-reactive antibodies. This model opens a new avenue for optimizing multivalent vaccines with combined cocktails or sequential immunizations, and for revealing reasons for vaccine success or failure on a structural basis.
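To make the idea of a "best binding structure" on a lattice concrete, the following is a minimal toy sketch and not the algorithm of the work summarized above. It assumes a 2D square lattice, a hypothetical two-letter (H/P) contact potential, and brute-force enumeration of self-avoiding placements; the function names `contact_energy`, `binding_energy`, `self_avoiding_paths`, and `best_binding` are all illustrative inventions.

```python
from itertools import product

# Toy illustration only. The lattice geometry, the potential, and the
# exhaustive search below are assumptions made for this sketch; they are
# not taken from the model described in the text.

NEIGHBOURS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # 2D square lattice


def contact_energy(a, b):
    """Hypothetical potential: hydrophobic pairs (H-H) attract, all else is 0."""
    return -1.0 if a == "H" and b == "H" else 0.0


def binding_energy(path, seq, antigen):
    """Sum contact energies between placed residues and adjacent antigen residues."""
    total = 0.0
    for (x, y), res in zip(path, seq):
        for dx, dy in NEIGHBOURS:
            other = antigen.get((x + dx, y + dy))
            if other is not None:
                total += contact_energy(res, other)
    return total


def self_avoiding_paths(start, length, blocked):
    """Yield every self-avoiding path of `length` sites that avoids `blocked`."""
    def extend(path):
        if len(path) == length:
            yield tuple(path)
            return
        x, y = path[-1]
        for dx, dy in NEIGHBOURS:
            nxt = (x + dx, y + dy)
            if nxt not in blocked and nxt not in path:
                path.append(nxt)
                yield from extend(path)
                path.pop()

    if start not in blocked:
        yield from extend([start])


def best_binding(seq, antigen, margin=1):
    """Brute-force search; returns (energy, path), or (0.0, None) if nothing binds."""
    xs = [x for x, _ in antigen]
    ys = [y for _, y in antigen]
    best = (0.0, None)  # unbound reference state has energy 0
    starts = product(range(min(xs) - margin, max(xs) + margin + 1),
                     range(min(ys) - margin, max(ys) + margin + 1))
    for start in starts:
        for path in self_avoiding_paths(start, len(seq), antigen):
            energy = binding_energy(path, seq, antigen)
            if energy < best[0]:
                best = (energy, path)
    return best
```

With the toy antigen `{(0, 0): "H", (1, 0): "H", (2, 0): "P"}`, `best_binding("HH", antigen)` finds an energy of -2.0, with each antibody residue placed next to one hydrophobic antigen site. Exhaustive enumeration grows exponentially with sequence length, which is why an efficient search, as the abstract emphasizes, matters for large affinity maturation simulations.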
CTLA4 is an essential negative regulator of T cell immune responses and is a key checkpoint regulating autoimmunity and anti-tumour immunity. Genetic mutations resulting in a quantitative defect in CTLA4 are associated with the development of an immune dysregulation syndrome. Endocytosis of CTLA4 is rapid and continuous with subsequent degradation or recycling. CTLA4 has two natural ligands, the surface transmembrane proteins CD80 and CD86 that are shared with the T cell co-stimulatory receptor CD28. Upon ligation with CD80/CD86, CTLA4 can remove these ligands from the opposing cells by transendocytosis. The efficiency of ligand removal is thought to be highly dependent on the processes involved in CTLA4 trafficking. With a combined in vitro-in silico study, we quantify the rates of CTLA4 internalization, recycling and degradation. We incorporate experimental data from cell lines and primary human T cells. Our model provides a framework for exploring the impact of altered affinity of natural ligands or therapeutic anti-CTLA4 antibodies and for predicting the effect of clinically relevant CTLA4 pathway mutations. The presented methodology for extracting trafficking rates can be transferred to the study of other transmembrane proteins.

In a healthy organism, the immune system has to maintain a balance between immune activation and inhibition. The decision between these two outcomes is influenced by receptors which are not specific for any antigenic stimulus. CD28 and CTLA4 are two such transmembrane receptors expressed by lymphocytes with opposing regulatory functions. These receptors bind to the same co-stimulatory ligands, CD80 and CD86, which are expressed by antigen presenting cells [1, 2].
While ligation of CD28 with co-stimulatory ligands is essential for full T cell activation and effector functions [3][4][5], CTLA4 inhibits excess or aberrant co-stimulation of T cells by competing for CD28 ligands and thereby preventing uncontrolled T cell activation and clonal expansion of cells specific for healthy tissues [6][7][8]. CTLA4 molecules are mostly observed in cytoplasmic vesicles [9][10][11][12] by virtue of their interaction with the µ2 subunit of the clathrin adaptor protein complex AP2 [13][14][15][16]. In contrast, CD28 is present on the plasma membrane with a slow turnover rate [17]. The unusual localization of CTLA4 raises questions about the mode of CTLA4 action. For example, it is not clear how the particular distribution of CTLA4 molecules results from dynamic trafficking between cytosol and plasma membrane. The various parameters affecting internalization, degradation and recycling rates all have the potential to influence the cellular distribution and function of CTLA4. Accordingly, mathematical models are useful in understanding and predicting the impact of changes in these parameters on CTLA4 localization and capacity to elicit suppressive function. Early data suggested that CT...