The Structural Antibody Database (SAbDab; http://opig.stats.ox.ac.uk/webapps/sabdab) is an online resource containing all the publicly available antibody structures, annotated and presented in a consistent fashion. The data are annotated with several properties including experimental information, gene details, correct heavy and light chain pairings, antigen details and, where available, antibody–antigen binding affinity. The user can select structures according to these attributes as well as structural properties such as complementarity determining region loop conformation and variable domain orientation. Individual structures, datasets and the complete database can be downloaded.
The Drug Design Data Resource (D3R) ran Grand Challenge 2015 between September 2015 and February 2016. Two targets served as the framework to test community docking and scoring methods: (i) HSP90, donated by AbbVie and the Community Structure Activity Resource (CSAR), and (ii) MAP4K4, donated by Genentech. The challenges for both target datasets were conducted in two stages, with the first stage testing pose predictions and the capacity to rank compounds by affinity with minimal structural data; and the second stage testing methods for ranking compounds with knowledge of at least a subset of the ligand-protein poses. An additional sub-challenge provided small groups of chemically similar HSP90 compounds amenable to alchemical calculations of relative binding free energy. Unlike previous blinded Challenges, we did not provide cognate receptors or receptors prepared with hydrogens and likewise did not require a specified crystal structure to be used for pose or affinity prediction in Stage 1. Given the freedom to select from over 200 crystal structures of HSP90 in the PDB, participants employed workflows that tested not only core docking and scoring technologies, but also methods for addressing water-mediated ligand-protein interactions, binding pocket flexibility, and the optimal selection of protein structures for use in docking calculations. Nearly 40 participating groups submitted over 350 prediction sets for Grand Challenge 2015. This overview describes the datasets and the organization of the challenge components, summarizes the results across all submitted predictions, and considers broad conclusions that may be drawn from this collaborative community endeavor.
A major goal in drug design is the improvement of computational methods for docking and scoring. The Community Structure Activity Resource (CSAR) aims to collect available data from industry and academia which may be used for this purpose. CSAR is also charged with organizing community-wide exercises based on the collected data. The first of these exercises aimed to gauge the overall state of docking and scoring, using a large and diverse data set of protein–ligand complexes. Participants were asked to calculate the affinity of the complexes as provided and then recalculate with changes which may improve their specific method. This first data set was selected from existing PDB entries which had binding data (Kd or Ki) in Binding MOAD, augmented with entries from PDBbind. The final data set contains 343 diverse protein–ligand complexes and spans 14 pKd units. Sixteen proteins have three or more complexes in the data set, from which a user could start an inspection of congeneric series. Inherent experimental error limits the possible correlation between scores and measured affinity: R² is limited to ∼0.9 when fitting to the data set without overparametrizing, and to ∼0.8 when scoring the data set with a method trained on outside data. The details of how the data set was initially selected, and the process by which it matured to better fit the needs of the community, are presented. Many groups generously participated in improving the data set, and this underscores the value of a supportive, collaborative effort in moving our field forward.
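The claim that experimental error caps the achievable correlation can be illustrated with a small simulation. This is a generic sketch, not the authors' analysis: it assumes Gaussian true affinities and Gaussian measurement noise, and the spread values used (`sigma_true`, `sigma_err`) are hypothetical numbers chosen only for illustration. Under those assumptions, even a predictor that reproduces the true affinity exactly cannot exceed R² = σ_true² / (σ_true² + σ_err²) against the measured values.

```python
import random
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def expected_r2_ceiling(sigma_true, sigma_err):
    """Analytical ceiling for a perfect predictor under Gaussian noise."""
    return sigma_true ** 2 / (sigma_true ** 2 + sigma_err ** 2)


def simulate_r2_ceiling(sigma_true, sigma_err, n=20000, seed=0):
    """R^2 of a 'perfect' predictor (the true pKd) against noisy measurements."""
    rng = random.Random(seed)
    # Hypothetical true affinities centred on pKd 6 (illustrative assumption).
    true_pkd = [rng.gauss(6.0, sigma_true) for _ in range(n)]
    # Measured values are the true values plus experimental error.
    measured = [t + rng.gauss(0.0, sigma_err) for t in true_pkd]
    return r_squared(true_pkd, measured)


if __name__ == "__main__":
    # Illustrative spreads only; not taken from the CSAR data set.
    print(expected_r2_ceiling(2.0, 0.5))
    print(simulate_r2_ceiling(2.0, 0.5))
```

The larger the measurement error relative to the spread of affinities in the data set, the lower this ceiling becomes, which is why a wide affinity range makes a benchmark more discriminating.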
Computational modeling of antibody structures plays a critical role in therapeutic antibody design. Several antibody modeling pipelines exist, but no freely available methods currently model nanobodies, provide estimates of expected model accuracy, or highlight potential issues with the antibody's experimental development. Here, we describe our automated antibody modeling pipeline, ABodyBuilder, designed to overcome these issues. The algorithm itself follows the standard 4 steps of template selection, orientation prediction, complementarity-determining region (CDR) loop modeling, and side chain prediction. ABodyBuilder then annotates the ‘confidence’ of the model as a probability that a component of the antibody (e.g., CDRL3 loop) will be modeled within a root-mean-square deviation threshold. It also flags structural motifs on the model that are known to cause issues during in vitro development. ABodyBuilder was tested on 4 separate datasets, including the 11 antibodies from the Antibody Modeling Assessment-II competition. ABodyBuilder builds models that are of similar quality to other methodologies, with sub-Angstrom predictions for the ‘canonical’ CDR loops. Its ability to model nanobodies and to rapidly generate models (∼30 seconds per model) widens its potential usage. ABodyBuilder can also help users in decision-making for the development of novel antibodies because it provides model confidence and potential sequence liabilities. ABodyBuilder is freely available at http://opig.stats.ox.ac.uk/webapps/abodybuilder.
Motivation: Antibody amino-acid sequences can be numbered to identify equivalent positions. Such annotations are valuable for antibody sequence comparison, protein structure modelling and engineering. Multiple different numbering schemes exist; they vary in the nomenclature they use to annotate residue positions, their definitions of position equivalence and their popularity within different scientific disciplines. However, currently no publicly available software exists that can apply all the most widely used schemes or for which an executable can be obtained under an open license.
Results: ANARCI is a tool to classify and number antibody and T-cell receptor amino-acid variable domain sequences. It can annotate sequences with the five most popular numbering schemes: Kabat, Chothia, Enhanced Chothia, IMGT and AHo.
Availability and implementation: ANARCI is available for download under GPLv3 license at opig.stats.ox.ac.uk/webapps/anarci. A web-interface to the program is available at the same address.
Contact: deane@stats.ox.ac.uk
As part of the Community Structure-Activity Resource (CSAR) center, a set of 343 high-quality protein–ligand crystal structures was assembled with experimentally determined Kd or Ki information from the literature. We encouraged the community to score the crystallographic poses of the complexes by any method of their choice. The goal of the exercise was to (1) evaluate the current ability of the field to predict activity from structure and (2) investigate the properties of the complexes and methods that appear to hinder scoring. A total of 19 different methods were submitted with numerous parameter variations for a total of 64 sets of scores from 16 participating groups. Linear regression and nonparametric tests were used to correlate scores to the experimental values. Correlation to experiment for the various methods ranged over R² = 0.58–0.12, Spearman ρ = 0.74–0.37, Kendall τ = 0.55–0.25, and median unsigned error = 1.00–1.68 pKd units. All types of scoring functions (force field based, knowledge based, and empirical) had examples with high and low correlation, showing no bias/advantage for any particular approach. The data across all the participants were combined to identify 63 complexes that were poorly scored across the majority of the scoring methods and 123 complexes that were scored well across the majority. The two sets were compared using a Wilcoxon rank-sum test to assess any significant difference in the distributions of >400 physicochemical properties of the ligands and the proteins. Poorly scored complexes were found to have ligands that were the same size as those in well-scored complexes, but hydrogen bonding and torsional strain were significantly different. These comparisons point to a need for CSAR to develop data sets of congeneric series with a range of hydrogen-bonding and hydrophobic characteristics and a range of rotatable bonds.
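The four evaluation statistics named above can be sketched in plain Python. This is a minimal illustration under stated assumptions rather than the procedure used in the exercise: the function names are hypothetical, the Kendall statistic is the simple tau-a form without tie correction, and the median unsigned error assumes the scores have already been mapped to pKd units (for example by a linear fit).

```python
from statistics import mean, median


def pearson_r(x, y):
    """Pearson correlation coefficient; square it to obtain R^2."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / total pairs, no tie correction."""
    n = len(x)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            total += (prod > 0) - (prod < 0)
    return total / (n * (n - 1) / 2)


def median_unsigned_error(predicted, experimental):
    """Median absolute difference; both inputs assumed to be in pKd units."""
    return median(abs(p - e) for p, e in zip(predicted, experimental))


if __name__ == "__main__":
    experimental = [1.0, 2.0, 3.0, 4.0, 5.0]   # toy values, not CSAR data
    predicted = [2.0, 1.0, 4.0, 3.0, 5.0]
    print(pearson_r(predicted, experimental) ** 2)
    print(spearman_rho(predicted, experimental))
    print(kendall_tau_a(predicted, experimental))
    print(median_unsigned_error(predicted, experimental))
```

The rank-based statistics are insensitive to any monotonic rescaling of the scores, which is why they are useful alongside R² when scoring functions report values on arbitrary scales.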
The Community Structure–Activity Resource (CSAR) recently held its first blinded exercise based on data provided by Abbott, Vertex, and colleagues at the University of Michigan, Ann Arbor. A total of 20 research groups submitted results for the benchmark exercise where the goal was to compare different improvements for pose prediction, enrichment, and relative ranking of congeneric series of compounds. The exercise was built around blinded high-quality experimental data from four protein targets: LpxC, Urokinase, Chk1, and Erk2. Pose prediction proved to be the most straightforward task, and most methods were able to successfully reproduce binding poses when the crystal structure employed was co-crystallized with a ligand from the same chemical series. Multiple evaluation metrics were examined, and we found that RMSD and native contact metrics together provide a robust evaluation of the predicted poses. It was notable that most scoring functions underpredicted contacts between the heteroatoms (i.e., N, O, S, etc.) of the protein and ligand. Relative ranking was found to be the most difficult area for the methods, but many of the scoring functions were able to properly identify Urokinase actives from the inactives in the series. We also found that minimizing the protein and correcting histidine tautomeric states trended positively with low RMSD for pose prediction, whereas minimizing the ligand trended negatively. Pregenerated ligand conformations performed better than those that were generated on the fly. Optimizing docking parameters and pretraining with the native ligand had a positive effect on the docking performance, as did using restraints, substructure fitting, and shape fitting. Lastly, for both sampling and ranking scoring functions, the use of the empirical scoring function appeared to trend positively with the RMSD.
Here, by combining the results of many methods, we hope to provide a statistically relevant evaluation and elucidate specific shortcomings of docking methodology for the community.
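The two pose-evaluation metrics mentioned above, RMSD and native contacts, can be sketched as follows. This is an illustrative sketch under assumptions, not the exercise's evaluation code: coordinates are plain (x, y, z) tuples in Å, the predicted and crystal ligands list atoms in the same order and share the receptor frame (no superposition or symmetry correction is applied), and the 4.0 Å contact cutoff and the function names are hypothetical choices.

```python
import math


def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two poses with matching atom order."""
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must contain the same number of atoms")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(pose_a, pose_b))
    return math.sqrt(squared / len(pose_a))


def contacts(ligand, protein, cutoff=4.0):
    """Set of (ligand atom index, protein atom index) pairs within the cutoff."""
    return {
        (i, j)
        for i, lig_atom in enumerate(ligand)
        for j, prot_atom in enumerate(protein)
        if math.dist(lig_atom, prot_atom) <= cutoff
    }


def native_contact_fraction(predicted, crystal, protein, cutoff=4.0):
    """Fraction of crystal-pose contacts that the predicted pose reproduces."""
    native = contacts(crystal, protein, cutoff)
    if not native:
        return 0.0
    return len(native & contacts(predicted, protein, cutoff)) / len(native)


if __name__ == "__main__":
    # Toy coordinates, purely illustrative.
    crystal = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    predicted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    protein = [(0.0, 0.0, -3.5), (1.0, 0.0, 3.0)]
    print(rmsd(predicted, crystal))
    print(native_contact_fraction(predicted, crystal, protein))
```

Reporting both numbers is informative because a pose can have a low RMSD yet miss key interactions, or recover the native contacts while a flexible, solvent-exposed group inflates the RMSD.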
The binding site of an antibody is formed between the two variable domains, VH and VL, of its antigen binding fragment (Fab). Understanding how VH and VL orientate with respect to one another is important both for studying the mechanisms of antigen specificity and affinity and improving antibody modelling, docking and engineering. Different VH-VL orientations are commonly described using relative measures such as root-mean-square deviation. Recently, the orientation has also been characterised using the absolute measure of a VH-VL packing angle. However, a single angle cannot fully describe all modes of orientation. Here, we present a method which fully characterises VH-VL orientation in a consistent and absolute sense using five angles (HL, HC1, LC1, HC2 and LC2) and a distance (dc). Additionally, we provide a computational tool, ABangle, to allow the VH-VL orientation for any antibody to be automatically calculated and compared with all other known structures. We compare previous studies and show how the modes of orientation being identified relate to movements of different angles. Thus, we are able to explain why different studies identify different structural clusters and different residues as important. Given this result, we then identify those positions and their residue identities which influence each of the angular measures of orientation. Finally, by analysing VH-VL orientation in bound and unbound forms, we find that antibodies specific for protein antigens are significantly more flexible in their unbound form than antibodies specific for hapten antigens. ABangle is freely available at http://opig.stats.ox.ac.uk/webapps/abangle.