2023
DOI: 10.1021/acs.jproteome.2c00363

NeuroPpred-SVM: A New Model for Predicting Neuropeptides Based on Embeddings of BERT

Abstract: Neuropeptides play pivotal roles in different physiological processes and are related to different kinds of diseases. Identification of neuropeptides is of great benefit for studying the mechanism of these physiological processes and the treatment of neurological disorders. Several state-of-the-art neuropeptide predictors have been developed by using a two-layer stacking ensemble algorithm. Although the two-layer stacking ensemble algorithm can improve the feature representability, these models are complex, wh…

Cited by 5 publications (2 citation statements)
References 52 publications
“…For each matrix, a common treatment is to calculate the average of the embeddings of all residues to derive the features [47]. Finally, the local representation of the protein in ProteinBERT [44] is processed into a 1562-D vector. The ProtBert [45] model is pretrained with protein sequences derived from the Uniref100 (https://www.uniprot.org/downloads) and also produces a matrix to represent one sequence.…”
Section: Results (mentioning)
confidence: 99%
“…The local representation takes the form of a matrix in which each row signifies the embedding of a residue, while the global representation is a 15599-D vector used as the representation for the entire sequence. For each matrix, a common treatment is to calculate the average of the embeddings of all residues to derive the features. Finally, the local representation of the protein in ProteinBERT is processed into a 1562-D vector.…”
Section: Results (mentioning)
confidence: 99%
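
The averaging step described in the citation statements (one embedding row per residue, averaged into a single fixed-length feature vector) can be sketched as below. This is only an illustrative sketch, not the cited authors' code: the function name `mean_pool` and the toy 4-D embeddings are assumptions made for the example, and a real per-residue matrix would come from a pretrained model such as ProteinBERT or ProtBert.

```python
from typing import List, Sequence


def mean_pool(residue_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """Average an L x D matrix (one row per residue) into one D-dimensional vector.

    Hypothetical helper for illustration; not part of any model's API.
    """
    if not residue_embeddings:
        raise ValueError("need at least one residue embedding")
    dim = len(residue_embeddings[0])
    if any(len(row) != dim for row in residue_embeddings):
        raise ValueError("all residue embeddings must have the same dimension")
    n = len(residue_embeddings)
    # Column-wise mean: feature j is the average of column j over all residues.
    return [sum(row[j] for row in residue_embeddings) / n for j in range(dim)]


# Toy input: 3 residues with made-up 4-D embeddings (real models use a much larger D).
toy = [
    [1.0, 0.0, 2.0, 4.0],
    [3.0, 0.0, 2.0, 0.0],
    [2.0, 3.0, 2.0, 2.0],
]
features = mean_pool(toy)
print(features)  # [2.0, 1.0, 2.0, 2.0]
```

Because the mean is taken over the residue axis, sequences of different lengths all map to vectors of the same length D, which is what lets a fixed-input classifier consume them.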