Cytotoxic T cells are of central importance in the immune system’s response to disease. They recognize defective cells by binding to peptides presented on the cell surface by MHC (major histocompatibility complex) class I molecules. Peptide binding to MHC molecules is the single most selective step in the antigen presentation pathway. In the quest for T cell epitopes, the prediction of peptide binding to MHC molecules has therefore attracted considerable attention. In the past, predictors of peptide-MHC interaction have in most cases been trained on binding affinity data. Recently, an increasing number of MHC-presented peptides identified by mass spectrometry have been published; these data contain information about peptide processing steps in the presentation pathway and the length distribution of naturally presented peptides. Here, we present NetMHCpan-4.0, a method trained on both binding affinity and eluted ligand data that leverages the information from both data types. Large-scale benchmarking of the method demonstrates increased predictive performance compared to the state of the art in the identification of naturally processed ligands, cancer neoantigens, and T cell epitopes.
Motivation: Many biological processes are guided by receptor interactions with linear ligands of variable length. One such receptor is the MHC class I molecule. The length preferences vary depending on the MHC allele, but are generally limited to peptides of 8-11 amino acids. On this relatively simple system, we developed a sequence alignment method based on artificial neural networks that allows insertions and deletions in the alignment. Results: We show that prediction methods based on alignments that include insertions and deletions have significantly higher performance than methods trained on peptides of a single length. We also illustrate how the location of deletions can aid the interpretation of the binding mode of the peptide-MHC complex, as in the case of long peptides bulging out of the MHC groove or protruding at either terminus. Finally, we demonstrate that the method can learn the length profile of different MHC molecules, and we quantify the reduction in experimental effort required to identify potential epitopes using our prediction algorithm. Availability and implementation: The NetMHC-4.0 method for the prediction of peptide-MHC class I binding affinity using gapped sequence alignment is publicly available at:
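The idea of aligning peptides of different lengths through insertions and deletions can be sketched as follows. This is only an illustration under stated assumptions, not the NetMHC-4.0 implementation: the function name `fixed_length_candidates`, the 9-residue target length, the use of a single block of consecutive deletions, and the `X` wildcard for insertions are all hypothetical choices made for the example. A predictor would then score each candidate and keep the best one.

```python
def fixed_length_candidates(peptide, core_len=9, wildcard="X"):
    """Enumerate fixed-length versions of a peptide (illustrative sketch).

    Assumptions (not taken from the abstract): longer peptides are shortened
    by deleting one block of consecutive residues; shorter peptides are
    lengthened by inserting one block of wildcard characters.
    """
    n = len(peptide)
    if n == core_len:
        return [peptide]
    if n > core_len:
        gap = n - core_len
        # Delete `gap` consecutive residues starting at each possible position.
        return [peptide[:i] + peptide[i + gap:] for i in range(n - gap + 1)]
    pad = wildcard * (core_len - n)
    # Insert the wildcard block at each possible position, including both ends.
    return [peptide[:i] + pad + peptide[i:] for i in range(n + 1)]


if __name__ == "__main__":
    for candidate in fixed_length_candidates("ACDEFGHIKL"):
        print(candidate)
```

In such a scheme, the position of the deletion in the best-scoring candidate is what would indicate whether a long peptide bulges in the middle or protrudes at a terminus.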
Major histocompatibility complex class II (MHC-II) molecules are expressed on the surface of professional antigen-presenting cells where they display peptides to T helper cells, which orchestrate the onset and outcome of many host immune responses. Understanding which peptides will be presented by the MHC-II molecule is therefore important for understanding the activation of T helper cells and can be used to identify T-cell epitopes. Here, we present updated versions of two MHC-II-peptide binding affinity prediction methods, NetMHCII and NetMHCIIpan. These were constructed using an extended data set of quantitative MHC-peptide binding affinity data obtained from the Immune Epitope Database covering HLA-DR, HLA-DQ, HLA-DP and H-2 mouse molecules. We show that training with this extended data set improved the performance of peptide binding predictions for both methods. Both methods are publicly available at www.cbs.dtu.dk/services/NetMHCII-2.3 and www.cbs.dtu.dk/services/NetMHCIIpan-3.2.
Background: Binding of peptides to MHC class I molecules (MHC-I) is essential for antigen presentation to cytotoxic T-cells. Results: Here, we demonstrate how a simple alignment step allowing insertions and deletions in a pan-specific MHC-I binding machine-learning model enables combining information across both multiple MHC molecules and peptide lengths. This pan-allele/pan-length algorithm significantly outperforms state-of-the-art methods, and captures differences in the length profile of binders to different MHC molecules leading to increased accuracy for ligand identification. Using this model, we demonstrate that percentile ranks in contrast to affinity-based thresholds are optimal for ligand identification due to uniform sampling of the MHC space. Conclusions: We have developed a neural network-based machine-learning algorithm leveraging information across multiple receptor specificities and ligand length scales, and demonstrated how this approach significantly improves the accuracy for prediction of peptide binding and identification of MHC ligands. The method is available at www.cbs.dtu.dk/services/NetMHCpan-3.0. Electronic supplementary material: The online version of this article (doi:10.1186/s13073-016-0288-x) contains supplementary material, which is available to authorized users.
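The contrast between percentile ranks and affinity-based thresholds can be made concrete with a small sketch. It assumes that a percentile rank is obtained by comparing a peptide's predicted score for a given MHC molecule against scores predicted for a background set of peptides on the same molecule; the function name `percentile_rank`, the "higher score is stronger" convention, and the toy background values are hypothetical and are not taken from NetMHCpan.

```python
from bisect import bisect_left


def percentile_rank(score, background_scores):
    """Percentage of background scores that are >= `score`.

    Assumes higher scores mean stronger predicted binding, so a low
    percentile rank marks a peptide that outscores most of the background.
    """
    ordered = sorted(background_scores)
    # Count background entries that are greater than or equal to `score`.
    n_at_least = len(ordered) - bisect_left(ordered, score)
    return 100.0 * n_at_least / len(ordered)


if __name__ == "__main__":
    # Toy background for one hypothetical MHC molecule.
    background = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
    print(percentile_rank(0.95, background))
```

Because the rank is computed per molecule against that molecule's own background, the same cutoff (for instance, the top few percent) selects a comparable fraction of peptides for every molecule, whereas a single absolute affinity threshold would select very different fractions for molecules with different score distributions.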