2021
DOI: 10.1093/nargab/lqab039

A large-scale comparative study on peptide encodings for biomedical classification

Abstract: Owing to the great variety of distinct peptide encodings, working on a biomedical classification task at hand is challenging. Researchers have to determine encodings capable to represent underlying patterns as numerical input for the subsequent machine learning. A general guideline is lacking in the literature, thus, we present here the first large-scale comprehensive study to investigate the performance of a wide range of encodings on multiple datasets from different biomedical domains. For the sake of comple…

Citations: cited by 19 publications (36 citation statements)
References: 42 publications
“…Another difference is that PEPred-Suite and Peptipedia use elaborate techniques to encode the peptide sequences, and MultiPep uses a simple one-hot encoding. Many peptide encoding techniques exist and peptide encoding in general is a field that is gaining a lot of attention [ 77 , 78 ]. PEPred-Suite uses adaptive feature representation strategy, where they, among other things, use 10 feature encoding algorithms, which together efficiently capture local and global compositional information as well as position-specific residue information and physicochemical information [ 26 ].…”
Section: Discussion (mentioning)
confidence: 99%
“…Peptipedia encodes peptides using representations of physicochemical properties and transforms them using Fourier transforms [ 27 , 79 ]. Although some general selection rules have been suggested, it is difficult to find a single universally optimal peptide encoding technique [ 77 , 78 ]. As it has been found that deep learning models require little encoding for the classification process [ 77 ], we chose to use a simple and, in our opinion, reliable encoding technique where all amino acids are equally similar or dissimilar (one-hot encoding).…”
Section: Discussion (mentioning)
confidence: 99%
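The one-hot encoding mentioned in the statement above can be illustrated with a minimal sketch. This is not MultiPep's actual implementation: the function names `one_hot_encode` and `hamming`, the residue ordering in `AMINO_ACIDS`, and the example peptide are all assumptions made for illustration. It shows why every pair of distinct amino acids ends up "equally dissimilar": any two different one-hot rows differ in exactly two positions.

```python
# Hypothetical illustration of one-hot peptide encoding; not taken from MultiPep.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues; this ordering is an arbitrary assumption


def one_hot_encode(peptide, alphabet=AMINO_ACIDS):
    """Return one row per residue, with a single 1 at that residue's alphabet index."""
    index = {aa: i for i, aa in enumerate(alphabet)}
    encoded = []
    for residue in peptide.upper():
        row = [0] * len(alphabet)
        row[index[residue]] = 1  # raises KeyError for residues outside the alphabet
        encoded.append(row)
    return encoded


def hamming(u, v):
    """Count the positions at which two equal-length rows differ."""
    return sum(a != b for a, b in zip(u, v))


# Example peptide (made up): alanine, cysteine, lysine
matrix = one_hot_encode("ACK")
```

Here `hamming(matrix[0], matrix[1])` and `hamming(matrix[0], matrix[2])` give the same value, so the encoding carries no physicochemical similarity between residues, in contrast to the more elaborate encodings the statement compares it with.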
“…For a comprehensive analysis on peptide encodings, Spänig et al. (2021) gathered a variety of datasets from multiple biomedical domains [20].…”
Section: Methods (mentioning)
confidence: 99%
“…In the case at least one model is significantly different, we used the Nemenyi test for post-hoc analysis [54]. Refer also to Spänig et al. (2021) for more details [20].…”
Section: Methods (mentioning)
confidence: 99%