The Immune Epitope Database (IEDB, iedb.org) captures experimental data confined in figures, text and tables of the scientific literature, making it freely available and easily searchable to the public. The scope of the IEDB extends across immune epitope data related to all species studied and includes antibody, T cell, and MHC binding contexts associated with infectious, allergic, autoimmune, and transplant-related diseases. Having been publicly accessible for more than 10 years, the IEDB has recently focused on improving query and reporting functionality so that users can access and summarize data that continue to grow in quantity and complexity. Here we present an update on our current efforts and future goals.
The task of epitope discovery and vaccine design is increasingly reliant on bioinformatics analytic tools and access to repositories of curated data relevant to immune reactions and specific pathogens. The Immune Epitope Database and Analysis Resource (IEDB) was created to assist biomedical researchers in the development of new vaccines, diagnostics, and therapeutics. The Analysis Resource is freely available to all researchers and provides access to a variety of epitope analysis and prediction tools. The tools include validated and benchmarked methods to predict MHC class I and class II binding. The predictions from these tools can be combined with those of tools that predict antigen processing, TCR recognition, and B cell epitopes. In addition, the resource contains a variety of secondary analysis tools that allow the researcher to calculate epitope conservation, population coverage, and other relevant analytic variables. Researchers involved in vaccine design and epitope discovery will also be interested in accessing published experimental data relevant to the specific indication of interest. The database component of the IEDB contains a vast amount of experimentally derived epitope data that can be queried through a flexible user interface. The IEDB is linked to other pathogen-specific and immunological database resources.
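As a rough illustration of what a population coverage calculation involves, the sketch below estimates the fraction of a population carrying at least one MHC allele from a chosen set. It is a simplified model, not the IEDB tool's implementation: it assumes Hardy-Weinberg proportions within each locus and independence between loci, and the allele names and frequencies are hypothetical placeholders.

```python
def locus_coverage(allele_freqs, covered_alleles):
    """Fraction of diploid individuals with >= 1 covered allele at one locus.

    Assumes Hardy-Weinberg proportions: an individual is uncovered only if
    both chromosomes carry an uncovered allele, i.e. (1 - p) ** 2.
    """
    p = sum(allele_freqs.get(a, 0.0) for a in set(covered_alleles))
    return 1.0 - (1.0 - p) ** 2


def population_coverage(loci, covered_alleles):
    """Combine loci, assuming they are independent (no linkage disequilibrium)."""
    uncovered = 1.0
    for freqs in loci.values():
        uncovered *= 1.0 - locus_coverage(freqs, covered_alleles)
    return 1.0 - uncovered


# Hypothetical allele frequencies, for illustration only.
toy_loci = {
    "locus_A": {"A_x": 0.3, "A_y": 0.2},
    "locus_B": {"B_x": 0.5, "B_y": 0.1},
}
toy_coverage = population_coverage(toy_loci, ["A_x", "B_x"])
```

Real allele frequencies vary by population, and linkage between loci can make the independence assumption optimistic, so a figure from a sketch like this is best read as an approximation.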
The Immune Epitope Database Analysis Resource (IEDB-AR, http://tools.iedb.org/) is a companion website to the IEDB that provides computational tools focused on the prediction and analysis of B and T cell epitopes. All of the tools are freely available through the public website and many are also available through a REST API and/or a downloadable command-line tool. A virtual machine image of the entire site is also freely available for non-commercial use and contains most of the tools on the public site. Here, we describe the tools and functionalities that are available in the IEDB-AR, focusing on the 10 new tools that have been added since the last report in the 2012 NAR webserver edition. In addition, many of the tools that were already hosted on the site in 2012 have been updated to their newest versions, including NetMHC, NetMHCpan, BepiPred and DiscoTope. Overall, this IEDB-AR update provides a substantial set of updated and novel features for epitope prediction and analysis.
Predicting epitopes recognized by cytotoxic T cells has been a long-standing challenge within the field of immuno- and bioinformatics. While reliable predictions of peptide binding are available for most Major Histocompatibility Complex class I (MHCI) alleles, prediction models of T cell receptor (TCR) interactions with MHC class I-peptide complexes remain poor due to the limited amount of available training data. Recent next-generation sequencing projects have, however, generated a considerable amount of data relating TCR sequences to their cognate HLA-peptide complex targets. Here, we utilize such data to train a sequence-based predictor of the interaction between TCRs and peptides presented by the most common human MHCI allele, HLA-A*02:01. Our model is based on convolutional neural networks, which are especially well suited to the challenges posed by the large length variations of TCRs. We show that such a sequence-based model allows for the identification of TCRs binding a given cognate peptide-MHC target out of a large pool of non-binding TCRs.
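The reason convolutional layers cope with variable-length input can be shown with a small sketch: filters slide along the sequence and a global max pool keeps one value per filter, so the output size depends only on the number of filters. This is not the published model. The one-hot encoding, filter width, ReLU activation, and untrained random weights are all assumptions made for illustration, and the example sequences are toy strings.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def one_hot(seq):
    """Encode each residue as a 20-dimensional indicator vector."""
    return [[1.0 if aa == ref else 0.0 for ref in AMINO_ACIDS] for aa in seq]


def make_filters(n_filters, width, seed=0):
    """Untrained random filters of shape (width, 20); a real model would learn these."""
    rng = random.Random(seed)
    return [
        [[rng.uniform(-1.0, 1.0) for _ in AMINO_ACIDS] for _ in range(width)]
        for _ in range(n_filters)
    ]


def conv_max_pool(seq, filters):
    """1D convolution + ReLU + global max pooling: one feature per filter."""
    x = one_hot(seq)
    features = []
    for f in filters:
        width = len(f)
        if len(x) < width:
            raise ValueError("sequence shorter than filter width")
        activations = []
        for start in range(len(x) - width + 1):
            s = sum(
                f[j][c] * x[start + j][c]
                for j in range(width)
                for c in range(len(AMINO_ACIDS))
            )
            activations.append(max(0.0, s))  # ReLU
        features.append(max(activations))  # global max over positions
    return features


filters = make_filters(n_filters=4, width=3)
short_features = conv_max_pool("CASSF", filters)         # toy 5-residue input
long_features = conv_max_pool("CASSLGQAYEQYF", filters)  # toy 13-residue input
```

Both feature vectors have length 4 despite the different input lengths, which is what lets a downstream classifier layer have a fixed size.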
Protein structures are classically described in terms of secondary structures. Although the regular secondary structures have relevant physical meaning, their recognition from atomic coordinates has some important limitations, such as uncertainties in the assignment of the boundaries of helical and β-strand regions. Further, on average about 50% of all residues are assigned to an irregular state, i.e., the coil. Thus, different research teams have focused on abstracting the conformation of the protein backbone in short, localized stretches. Using different geometric measures, local stretches in protein structures are clustered into a chosen number of states. A prototype representative of the local structures in each cluster is generally defined. These libraries of local structure prototypes are called "structural alphabets". We have developed a structural alphabet, named Protein Blocks, not only to approximate protein structures, but also to predict them from sequence. Since its development, we and other teams have explored numerous new research fields using this structural alphabet. We review here some of the most interesting applications.
B-cells can neutralize pathogenic molecules by targeting them with extreme specificity using receptors secreted or expressed on their surface (antibodies). This is achieved via molecular interactions between the paratope (i.e., the antibody residues involved in the binding) and the interacting region (epitope) of its target molecule (antigen). Discerning the rules that define this specificity would have profound implications for our understanding of humoral immunogenicity and its applications. The aim of this work is to produce improved, antibody-specific epitope predictions by exploiting features derived from the structures of the antigens and their cognate antibodies, and combining them using statistical and machine learning algorithms. We have identified several geometric and physicochemical features that are correlated in interacting paratopes and epitopes, used them to develop a Monte Carlo algorithm to generate putative epitope-paratope pairs, and trained a machine-learning model to score them. We show that, by including the structural and physicochemical properties of the paratope, we improve the prediction of the target of a given B-cell receptor. Moreover, we demonstrate a gain in predictive power both in terms of identifying the cognate antigen target for a given antibody and the antibody target for a given antigen, exceeding the results of other available tools.
The adaptive immune system in vertebrates has evolved to recognize non-self antigens, such as proteins expressed by infectious agents and mutated cancer cells. T cells play an important role in antigen recognition by expressing a diverse repertoire of antigen-specific receptors, which bind epitopes to mount targeted immune responses. Recent advances in high-throughput sequencing have enabled the routine generation of T-cell receptor (TCR) repertoire data. Identifying the specific epitopes targeted by different TCRs in these data would be valuable. To accomplish this, we took advantage of the ever-increasing number of TCRs with known epitope specificity curated in the Immune Epitope Database (IEDB) since 2004. We compared seven metrics of sequence similarity to determine their power to predict whether two TCRs have the same epitope specificity. We found that a comprehensive k-mer matching approach produced the best results, which we have implemented in TCRMatch, an openly accessible tool (http://tools.iedb.org/tcrmatch/) that takes TCR β-chain CDR3 sequences as input, identifies TCRs with a match in the IEDB, and reports the specificity of each match. We anticipate that this tool will provide new insights into T cell responses captured in receptor repertoire and single-cell sequencing experiments and will facilitate the development of new strategies for monitoring and treatment of infectious, allergic, and autoimmune diseases, as well as cancer.
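To make the idea of k-mer matching concrete, the sketch below scores two CDR3 sequences by the cosine similarity of their k-mer count vectors and looks up a query against a small reference set. It is a simplified stand-in and should not be read as the scoring function used by TCRMatch. The function names, the choice of cosine similarity, the threshold, and the toy sequences and epitope labels are all assumptions for illustration.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k_max):
    """Count every substring of length 1..k_max in seq."""
    counts = Counter()
    for k in range(1, k_max + 1):
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def kmer_similarity(a, b, k_max=3):
    """Cosine similarity of k-mer count vectors, in [0, 1]."""
    ca, cb = kmer_counts(a, k_max), kmer_counts(b, k_max)
    dot = sum(n * cb[kmer] for kmer, n in ca.items())
    norm = sqrt(sum(n * n for n in ca.values())) * sqrt(sum(n * n for n in cb.values()))
    return dot / norm if norm else 0.0


def find_matches(query, reference, threshold=0.9):
    """Return (cdr3, epitope, score) for reference entries scoring >= threshold."""
    hits = []
    for cdr3, epitope in reference.items():
        score = kmer_similarity(query, cdr3)
        if score >= threshold:
            hits.append((cdr3, epitope, score))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


# Toy reference set with placeholder epitope labels (not real IEDB records).
toy_reference = {
    "CASSLGQAYEQYF": "epitope_A",
    "CASSPDRGTEAFF": "epitope_B",
}
toy_hits = find_matches("CASSLGQAYEQYF", toy_reference)
```

An identical sequence scores 1 and sequences sharing no residues score 0; where to set the threshold between those extremes is a trade-off between sensitivity and specificity that would need to be calibrated on data with known specificities.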