Antimicrobial peptides (AMPs), a crucial part of the innate immune system, are promising candidates for antibacterial agents, and researchers have devoted considerable effort to developing novel AMPs in recent decades. To this end, many computational approaches have been developed to identify potential AMPs accurately. However, finding peptides specific to a particular bacterial species remains challenging. Streptococcus mutans is a pathogen with a clear cariogenic effect, so studying AMPs that inhibit S. mutans is of great significance for the prevention and treatment of caries. In this study, we propose a sequence‐based machine learning model, iASMP, to accurately identify potential anti‐S. mutans peptides (ASMPs). After collecting ASMPs, we compared the performance of models built from multiple feature descriptors and different classification algorithms. Among the baseline predictors, the model combining the extra trees (ET) algorithm with hybrid features gave the best results. Feature selection was then applied to remove redundant features and further improve model performance. The final model achieved a maximum accuracy (ACC) of 0.962 on the training dataset and an ACC of 0.750 on the testing dataset. The results demonstrated that iASMP had excellent predictive performance and was suitable for identifying potential ASMPs. We also visualized the selected features and explained the impact of individual features on the model output.
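To make the idea of "hybrid features" concrete, the following is a minimal sketch of how two common sequence-based descriptors can be computed and concatenated. The abstract does not state which descriptors iASMP uses, so the choice of amino acid composition (AAC) and dipeptide composition (DPC), the function names, and the example peptide string are all assumptions made for illustration.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so feature positions are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aac(seq):
    """Amino acid composition: fraction of each standard residue (20 values)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide composition: fraction of each adjacent residue pair (400 values)."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must have at least two residues")
    # Overlapping pairs: positions (0,1), (1,2), ...
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    return [pairs.count(a + b) / len(pairs)
            for a, b in product(AMINO_ACIDS, repeat=2)]


def hybrid_features(seq):
    """Concatenate AAC and DPC into one 420-dimensional vector (assumed hybrid)."""
    return aac(seq) + dpc(seq)


# Hypothetical peptide string, used only to show the call.
features = hybrid_features("GLFDIVKKVV")
print(len(features))
```

Non-standard residue letters are counted in the sequence length but have no feature position, so fractions for such sequences sum to less than 1. Vectors like these could then be passed to a tree-ensemble classifier (for example scikit-learn's `ExtraTreesClassifier`) and pruned with a feature selection step, which is the general shape of the workflow the abstract describes.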
Although peptides are regarded as ideal therapeutic agents, only a small proportion of marketed drugs are peptides. In the past decade, pharmacists have paid great attention to the development of peptide therapeutics. Except for a few approved chemically or rationally designed peptides, most attempts have failed because of unsatisfactory efficacy or safety. Fortunately, computational methods such as artificial intelligence have been used to accelerate the discovery of therapeutic peptides by predicting the activity, toxicity, and absorption, distribution, metabolism, and excretion (ADME) of polypeptides. Usually, a specific biological activity of a peptide can be accurately determined by an interest-oriented binary classifier built from a positive set and a negative set that has not been experimentally validated, regardless of other characteristics. This makes it challenging to evaluate the research object comprehensively in the early stage of drug research and development. Herein, we propose an integrated method (GM-Pep) that combines a conditional variational autoencoder (CVAE) with a multiclassifier trained on positive samples (Deep-Multiclassifier) to effectively generate a single bioactive peptide sequence without toxicity and referential side effects. The results showed that our Deep-Multiclassifier model gave a sequence accuracy of up to 96.41% [toxicity (94.48%), antifungal (96.58%), antihypertensive (97.18%), and antibacterial (96.91%)]. The properties of Deep-Multiclassifier and CVAE were validated using 12 antibacterial peptides synthesized for the first time or by comparison with random peptides. The source code and data sets are available at https://github.com/TimothyChen225/GM-Pep.