Many bioactive peptides have demonstrated therapeutic effects against complex diseases, including antiviral, antibacterial, and anticancer activity. As with the de novo generation of chemical compounds, the accumulated bioactive peptides can serve as a training set for deep learning models that generate abundant potential bioactive peptides. Such techniques would be significant for drug development, since peptides are much easier and cheaper to synthesize than compounds. However, there are very few deep learning-based peptide generation models. Here, we have created an LSTM model (named LSTM_Pep) to generate de novo peptides, and we fine-tune it to generate de novo peptides with specific potential therapeutic effects. Notably, the Antimicrobial Peptide Database has been fully utilized in this work to generate various kinds of potentially active de novo peptides. We propose a pipeline for screening the generated peptides against a given target and use the main protease of SARS-CoV-2 as a proof-of-concept example. Moreover, we have developed a deep learning-based protein-peptide prediction model (named DeepPep) for fast screening of the generated peptides against given targets. Together with the generative model, we demonstrate that iterating fine-tuning, generation, and screening can yield peptides with higher predicted binding affinity. Our work sheds light on the development of deep learning-based methods and pipelines for effectively generating and obtaining bioactive peptides with a specific therapeutic effect, and showcases how artificial intelligence can help discover de novo bioactive peptides that bind to a particular target.
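The iterative loop of generating, screening, and retraining described in this abstract can be illustrated with a toy sketch. Nothing below comes from the paper: `generate` is a hypothetical stand-in for sampling from LSTM_Pep (it only mutates one residue), `score` is a hypothetical stand-in for DeepPep (it only counts cationic residues), and "fine-tuning" is approximated by keeping the top-scoring peptides as the next round's pool.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def generate(pool, n, rng):
    """Hypothetical stand-in for a generative model: point-mutate pool peptides."""
    out = []
    for _ in range(n):
        pep = list(rng.choice(pool))
        pep[rng.randrange(len(pep))] = rng.choice(AMINO_ACIDS)
        out.append("".join(pep))
    return out

def score(peptide):
    """Hypothetical stand-in for a screening model: fraction of K/R residues.
    This is NOT a binding-affinity predictor."""
    return sum(peptide.count(a) for a in "KR") / len(peptide)

def iterate(seed_pool, rounds=5, n_generate=200, n_keep=20, seed=0):
    """Repeat generate -> screen -> keep the best as the next training pool."""
    rng = random.Random(seed)
    pool = list(seed_pool)
    history = []  # mean score of the retained pool after each round
    for _ in range(rounds):
        # Keep the previous pool among the candidates so the best never get lost.
        candidates = set(generate(pool, n_generate, rng)) | set(pool)
        # Sort by descending score, breaking ties alphabetically for determinism.
        pool = sorted(candidates, key=lambda p: (-score(p), p))[:n_keep]
        history.append(sum(score(p) for p in pool) / len(pool))
    return pool, history
```

Because each round's candidates include the previous pool, the mean score of the retained peptides cannot decrease from one round to the next. That is the property the iterative scheme relies on, whatever generator and scorer are actually used.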
The core of large-scale virtual drug screening is to accurately and efficiently select high-affinity binders from large libraries of small molecules in which non-binders usually dominate. The protein pocket, the spatial information of the ligand, and the residue and atom types play a pivotal role in binding affinity. Here we used the pocket residues or ligand atoms as nodes and constructed edges from neighboring information to comprehensively represent the protein pocket or the ligand. Moreover, we find that the model with pre-trained molecular vectors performs better than the one-hot representation. The main advantage of DeepBindGCN is that it does not depend on docking conformations and concisely retains the spatial information and physicochemical features. Notably, DeepBindGCN_BC has high precision on many DUD.E datasets, and DeepBindGCN_RG achieves a very low RMSE on most DUD.E datasets. Using TIPE3 and the PD-L1 dimer as proof-of-concept examples, we propose a screening pipeline that integrates DeepBindGCN_BC, DeepBindGCN_RG, and other methods to identify compounds with strong binding affinity. In addition, a DeepBindGCN_RG_x model has been used to compare performance with other methods on the PDBbind v.2016 and v.2013 core sets. This is the first time that a model that does not depend on the complex structure achieves an RMSE of 1.3843 and a Pearson R of 0.7719 on the PDBbind v.2016 core set, showing prediction power comparable to state-of-the-art affinity prediction models that rely on the 3D complex. Our DeepBindGCN provides a powerful tool for predicting protein-ligand interactions and can be used in many important large-scale virtual screening scenarios.
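The graph representation and the two reported metrics can be sketched as follows. This is a minimal illustration under stated assumptions, not the DeepBindGCN implementation: the abstract does not say how neighbors are defined, so the distance cutoff (5.0, in the units of the coordinates) is an assumption, and the function names `build_edges`, `one_hot`, `rmse`, and `pearson_r` are hypothetical.

```python
import numpy as np

def build_edges(coords, cutoff=5.0):
    """Connect nodes (pocket residues or ligand atoms) lying within `cutoff`.
    The cutoff rule is an assumption; the abstract only says 'neighboring'."""
    coords = np.asarray(coords, dtype=float)
    # Pairwise Euclidean distances between all nodes.
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Neighbors within the cutoff, excluding self-loops.
    adjacency = (dist <= cutoff) & ~np.eye(len(coords), dtype=bool)
    src, dst = np.nonzero(adjacency)
    return np.stack([src, dst])  # shape (2, n_edges), both directions listed

def one_hot(labels, vocabulary):
    """One-hot node features, the baseline the abstract compares against."""
    index = {label: i for i, label in enumerate(vocabulary)}
    features = np.zeros((len(labels), len(vocabulary)))
    for row, label in enumerate(labels):
        features[row, index[label]] = 1.0
    return features

def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted affinities."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between measured and predicted values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])
```

The pocket and the ligand would each get their own graph built this way, which is consistent with the claim that no docked complex is required. Replacing `one_hot` with pre-trained vectors only changes the node feature matrix, not the edges.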