Neuropeptides play pivotal roles in many physiological processes
and are associated with a range of diseases. Identifying
neuropeptides helps elucidate the mechanisms of these
physiological processes and supports the treatment of neurological disorders.
Several state-of-the-art neuropeptide predictors have been developed
using a two-layer stacking ensemble algorithm. Although the two-layer
stacking ensemble algorithm can improve feature representation,
the resulting models are complex and less efficient than models
based on a single classifier. In this study, we propose a new model, NeuroPpred-SVM,
which predicts neuropeptides from the embeddings of Bidirectional
Encoder Representations from Transformers (BERT) and other sequence-based features
using a support vector machine (SVM). The experimental results
indicate that our model achieved a cross-validation area under the
receiver operating characteristic curve (AUROC) of 0.969 on the training
data set and an AUROC of 0.966 on the independent test set. By comparing
our model with four other state-of-the-art models (NeuroPIpred,
PredNeuroP, NeuroPpred-Fuse, and NeuroPpred-FRL) on the independent
test set, we found that it achieved the highest AUROC, Matthews correlation
coefficient, accuracy, and specificity, indicating that it
outperforms the existing models. We believe that NeuroPpred-SVM could
be a useful tool for identifying neuropeptides with high accuracy
and low cost. The data sets and Python code are available at .
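
For readers unfamiliar with the four evaluation metrics named above (AUROC, Matthews correlation coefficient, accuracy, and specificity), the following is a minimal sketch of how they can be computed for a binary classifier. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made purely for illustration, and the numbers bear no relation to the results reported here.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 = neuropeptide."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def specificity(tp, tn, fp, fn):
    """Fraction of true negatives among all actual negatives."""
    return tn / (tn + fp)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 if any marginal is empty."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auroc(y_true, scores):
    """AUROC as the probability that a random positive outscores a random
    negative (ties count one half). Quadratic time, fine for small examples."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (made up for illustration): true labels and classifier scores.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]
counts = confusion_counts(y_true, y_pred)

print("accuracy   :", round(accuracy(*counts), 3))
print("specificity:", round(specificity(*counts), 3))
print("MCC        :", round(mcc(*counts), 3))
print("AUROC      :", round(auroc(y_true, scores), 3))
```

Note that AUROC is computed from the continuous scores and does not depend on the threshold, whereas accuracy, specificity, and the Matthews correlation coefficient are computed from the thresholded predictions.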