B cells are an essential component of the immune system and play a vital role in the immune response against pathogenic infection by producing antibodies. Existing methods predict either linear or conformational B-cell epitopes in an antigen. In this study, a single method was developed for predicting both types (linear and conformational) of B-cell epitopes. The dataset used in this study contains 3875 B-cell epitopes and 3996 non-B-cell epitopes, where the B-cell epitopes include both linear and conformational epitopes. Our primary analysis indicates that certain residues (such as Asp, Glu, Lys, and Asn) are more prominent in B-cell epitopes. We developed machine-learning-based methods using different types of sequence composition and achieved the highest AUC of 0.80 using dipeptide composition. In addition, models were developed on selected features, but no further improvement was observed. Our similarity-based method, implemented using BLAST, shows a high probability of correct prediction but poor sensitivity. Finally, we developed a hybrid model that combines an alignment-free model (dipeptide-based random forest) with an alignment-based model (BLAST-based similarity). Our hybrid model attained a maximum AUC of 0.83 with an MCC of 0.49 on the independent dataset, and it performs better than existing methods on the independent dataset used in this study. All models were trained and tested on 80% of the data using cross-validation, and the final model was evaluated on the remaining 20%, called the independent or validation dataset. A web server and standalone package named "CLBTope" has been developed for predicting, designing, and scanning B-cell epitopes in an antigen sequence (https://webs.iiitd.edu.in/raghava/clbtope/).
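The dipeptide composition feature mentioned above can be illustrated with a minimal sketch. It assumes the common definition (the percentage of each of the 400 ordered amino-acid pairs among all adjacent pairs in the sequence); the function name `dipeptide_composition` and the choice to skip pairs containing non-standard residues are illustrative assumptions, not details taken from the CLBTope paper.

```python
from itertools import product

# The 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered pairs, in a fixed order so feature vectors are comparable.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-dimensional vector of dipeptide percentages.

    Hypothetical helper: pairs containing non-standard residues are
    skipped, which is an assumption made for this sketch.
    """
    seq = sequence.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        # Sequence too short (or no valid pairs): return an all-zero vector.
        return [0.0] * len(DIPEPTIDES)
    return [100.0 * counts[d] / total for d in DIPEPTIDES]


features = dipeptide_composition("ACAC")
print(len(features))
```

A vector like this could then be passed to a classifier such as a random forest; the model settings used in the study are not given in the abstract.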
There are a number of antigens that induce an autoimmune response against β-cells, leading to type 1 diabetes mellitus (T1DM). Recently, several antigen-specific immunotherapies have been developed to treat T1DM. Thus, identification of T1DM associated peptides with antigenic regions or epitopes is important for peptide-based therapeutics (e.g. immunotherapeutics). In this study, for the first time, an attempt has been made to develop a method for predicting, designing, and scanning T1DM associated peptides with high precision. We analysed 815 T1DM associated peptides and observed that these peptides are not associated with a specific class of HLA alleles. Thus, HLA binder prediction methods are not suitable for predicting T1DM associated peptides. First, we developed a similarity/alignment-based method using the Basic Local Alignment Search Tool and achieved a high probability of correct hits with poor coverage. Second, we developed an alignment-free method using machine learning techniques and obtained a maximum AUROC of 0.89 using dipeptide composition. Finally, we developed a hybrid method that combines the strengths of both alignment-free and alignment-based methods and achieves a maximum area under the receiver operating characteristic curve of 0.95 with a Matthews correlation coefficient of 0.81 on an independent dataset. We developed a web server, ‘DMPPred’, and a stand-alone server for predicting, designing and scanning T1DM associated peptides (https://webs.iiitd.edu.in/raghava/dmppred/).
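The abstract does not say how the alignment-based and alignment-free predictions are combined. One plausible scheme, sketched below, shifts the machine-learning score up or down when a BLAST search finds a hit against a known positive or negative peptide, and leaves it unchanged when there is no hit. The function names, the `bonus` of 0.5, and the `threshold` of 0.5 are all assumptions made for illustration and are not taken from the DMPPred paper.

```python
def hybrid_score(ml_score, blast_label, bonus=0.5):
    """Combine an alignment-free score with a BLAST-based label.

    ml_score    -- probability-like score from the ML model, in [0, 1]
    blast_label -- "positive", "negative", or None when BLAST finds no hit
    bonus       -- hypothetical weight given to a BLAST hit (assumption)
    """
    if blast_label == "positive":
        return ml_score + bonus
    if blast_label == "negative":
        return ml_score - bonus
    # No hit: fall back on the alignment-free prediction alone.
    return ml_score


def classify(ml_score, blast_label, threshold=0.5):
    """Return True when the combined score reaches the threshold."""
    return hybrid_score(ml_score, blast_label) >= threshold


# A borderline ML score is rescued by a hit against a known positive peptide.
print(classify(0.4, None), classify(0.4, "positive"))
```

A scheme of this shape addresses the poor coverage of a similarity search, because peptides without a BLAST hit still receive the alignment-free prediction.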
There are a number of antigens that induce an autoimmune response against beta-cells, leading to Type 1 diabetes mellitus (T1DM). Recently, several antigen-specific immunotherapies have been developed to treat T1DM. Thus, identification of T1DM associated peptides with antigenic regions or epitopes is important for peptide-based therapeutics (e.g., immunotherapeutics). In this study, for the first time, an attempt has been made to develop a method for predicting, designing and scanning T1DM associated peptides with high precision. We analyzed 815 T1DM associated peptides and observed that these peptides are not associated with a specific class of HLA alleles. Thus, HLA binder prediction methods are not suitable for predicting T1DM associated peptides. First, we developed a similarity/alignment-based method using BLAST and achieved a high probability of correct hits with poor coverage. Second, we developed an alignment-free method using machine learning techniques and obtained a maximum AUROC of 0.89 using dipeptide composition. Finally, we developed a hybrid method that combines the strengths of both alignment-free and alignment-based methods and achieves a maximum AUROC of 0.95 with an MCC of 0.81 on an independent dataset. We developed a web server, DMPPred, and a standalone server for predicting, designing and scanning T1DM associated peptides (https://webs.iiitd.edu.in/raghava/dmppred/).
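For readers unfamiliar with the MCC values reported above, the sketch below computes the Matthews correlation coefficient from confusion-matrix counts using its standard definition. The function name `mcc` and the convention of returning 0.0 when the denominator is zero are choices made for this example; the counts shown are made up and are not results from the paper.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    adopted here as an assumption.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Illustrative counts only.
print(mcc(tp=50, tn=50, fp=0, fn=0))    # perfect prediction
print(mcc(tp=25, tn=25, fp=25, fn=25))  # no better than chance
```

MCC ranges from -1 to 1 and, unlike accuracy, stays informative when the positive and negative classes differ in size.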
HLA-DRB1*04:01 is associated with many diseases, including sclerosis, arthritis, diabetes and COVID-19. Thus, it is important to scan for binders of HLA-DRB1*04:01 in an antigen to develop immunotherapies, vaccines and protection against these diseases. One of the major limitations of existing methods for predicting HLA-DRB1*04:01 binders is that they were trained on small datasets. This study presents a method, HLA-DR4Pred2, developed on a large dataset containing 12676 binders and an equal number of non-binders. It is an improved version of HLA-DR4Pred, which was trained on a small dataset containing only 576 binders and an equal number of non-binders. All models in this study were trained, optimized and tested on 80% of the data, called the training dataset, using five-fold cross-validation; final models were evaluated on the remaining 20% of the data, called the validation/independent dataset. A wide range of machine learning techniques were employed to develop prediction models, which achieved a maximum AUC of 0.90 and 0.87 on the validation dataset using composition and binary profile features, respectively. The performance of our composition-based model increased from 0.90 to 0.93 when combined with BLAST search. In addition, we also developed models on an alternate or realistic dataset that contains 12676 binders and 86300 non-binders and achieved a maximum AUC of 0.99. Our best model performs better than existing methods on the validation dataset. Finally, we developed standalone and online versions of HLA-DR4Pred2 for predicting, designing and virtual scanning of HLA-DRB1*04:01 binders (https://webs.iiitd.edu.in/raghava/hladr4pred2/ ; https://github.com/raghavagps/hladr4pred2).
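The binary profile features mentioned above are commonly built as a one-hot encoding, in which each residue position becomes a 20-element vector with a single 1. The sketch below assumes that definition and additionally assumes that peptides are truncated or padded to a fixed length, with padding encoded as all zeros; the function name `binary_profile`, the default `length` of 15, and the padding character "X" are hypothetical and are not taken from the HLA-DR4Pred2 paper.

```python
# The 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def binary_profile(peptide, length=15):
    """One-hot encode a peptide as a flat list of length * 20 zeros and ones.

    The peptide is cut to `length` residues or right-padded with "X";
    "X" and any other non-standard residue encode as 20 zeros
    (both choices are assumptions for this sketch).
    """
    padded = peptide.upper()[:length].ljust(length, "X")
    profile = []
    for residue in padded:
        profile.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return profile


vector = binary_profile("ACD", length=5)
print(len(vector), sum(vector))
```

Unlike composition features, a binary profile preserves the position of each residue, at the cost of requiring a fixed input length.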