2022
DOI: 10.1038/s43588-022-00372-4

Unconstrained generation of synthetic antibody–antigen structures to guide machine learning methodology for antibody specificity prediction

Abstract: Machine learning (ML) is a key technology for accurate prediction of antibody–antigen binding. Two orthogonal problems hinder the application of ML to antibody-specificity prediction and the benchmarking thereof: the lack of a unified ML formalization of immunological antibody-specificity prediction problems and the unavailability of large-scale synthetic benchmarking datasets of real-world relevance. Here, we developed the Absolut! software suite that enables parameter-based unconstrained generation of synthe…

Citations: cited by 29 publications (46 citation statements)
References: 126 publications
“…We suggest that peptide imbalance contributes more to a better performance of the models than size, a finding that was also made in antibody-antigen prediction Robert et al. (17). It will be interesting to see whether the models perform well purely because of peptide frequency, or whether other factors such as biological or physicochemical properties may influence performance.…”
Section: Discussion (supporting)
confidence: 64%
“…1C), in line with previous studies (24, 63). In addition, AIR information may include derived features such as CDR3 length (64), physicochemical properties (65), or binding energy (66). In the case of paired chain data, the immune signal would include this information from both chains (3, 15, 16, 67).…”
Section: Results (mentioning)
confidence: 99%
“…There is a potential risk that an AIRR-ML model trained on simulated data may only learn the artifacts of the simulation framework, thereby impacting the applicability of ML model-related insights to real-world scenarios. This may be addressed in the future by analyzing how ML results on synthetic data transfer to experimental data as recently demonstrated by Robert et al. (66). Further work is needed on the biological understanding of immune signals for improved simulation (38, 63).…”
Section: Discussion (mentioning)
confidence: 98%
“…28 The epitope, which is present on the surface of the antigen, consists of a continuous sequence or a discontinuous three-dimensional protein structure. 29 In nature, Cd usually exists in the +2 form. Cd²⁺, which is not immunogenic due to its small molecular weight, cannot be combined with a carrier protein to form a complete antigenic epitope.…”
Section: Characterization Of the Mab (mentioning)
confidence: 99%