2020
DOI: 10.1109/tcbb.2019.2911677

Amino Acid Encoding Methods for Protein Sequences: A Comprehensive Review and Assessment

Cited by 63 publications (50 citation statements)
References 60 publications
“…Choosing how to digitally encode amino acids is a crucial step in this context, since it can affect to the overall performance of the models [18]. A comprehensive review and assessment on different amino acid encoding methods [19] shows that position specific scoring matrix (PSSM), an evolution-based position dependent methodology, achieves the best performance on protein secondary structure prediction and protein fold recognition tasks. However, this type of encoding is very consuming computationally [20] and its applicability is limited to proteins with known homologous sequences [19], which could highly decrease the generalisation capabilities of the predictor for non evolutionary related proteins.…”
Section: Introduction (mentioning)
confidence: 99%
“…Yet, in this work, proteins were mapped with these methods, and in general, the results were better than some methods. The best results have not been achieved, and the main reason may be that these methods, including Atchley factors are more effective in prediction of secondary structure of proteins [9]. The performance of binary one-hot seems fair.…”
Section: The Performance Comparison By Using the BLSTM (mentioning)
confidence: 99%
“…In the literature, there are limited methods for converting protein sequences into the numbers. In general, BLOSUM62 (BLOcks SUbstitution Matrix), PAM25 (Point Accepted Mutation), hydrophobicity, EIIP (Electron-Ion Interaction Potential) are applied and the performance of family classification is highly depending on the conversion method [7,9]. Recently, deep learning models are actively used in bioinformatics studies and show promising results.…”
Section: Introduction (mentioning)
confidence: 99%
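
As an illustration of what "converting protein sequences into the numbers" can look like, here is a minimal sketch of two of the simpler encodings named in these statements: binary one-hot and a per-residue hydrophobicity value. The statements do not say which hydrophobicity scale was used, so the Kyte-Doolittle hydropathy index here is an assumption, as are the function names and the handling of non-standard residues. It is not the implementation of any of the cited papers.

```python
# Standard 20 amino acids, ordered by one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

# Kyte-Doolittle hydropathy index (assumed scale; the source names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def one_hot_encode(sequence):
    """Return one 20-dimensional binary vector per residue."""
    encoded = []
    for residue in sequence.upper():
        vector = [0] * len(AMINO_ACIDS)
        # Assumption: residues outside the 20-letter alphabet (e.g. X)
        # are left as an all-zero vector.
        if residue in INDEX:
            vector[INDEX[residue]] = 1
        encoded.append(vector)
    return encoded


def hydrophobicity_encode(sequence):
    """Return one hydropathy value per residue (0.0 for unknown residues)."""
    return [KYTE_DOOLITTLE.get(residue, 0.0) for residue in sequence.upper()]


if __name__ == "__main__":
    peptide = "ACDY"  # hypothetical example sequence
    print(one_hot_encode(peptide))
    print(hydrophobicity_encode(peptide))
```

Both encodings are position-independent: a residue always maps to the same vector or value wherever it occurs. That is the contrast with the position-dependent PSSM encoding described in the first statement.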
“…It mainly includes two key techniques which are feature representation and classifier. For feature representation, if the features of peptide sequences are well-extracted, it will be easier to precisely predict the ACPs (Jing et al, 2019). At present, some tools in the prediction of ACPs have been developed.…”
Section: Introduction (mentioning)
confidence: 99%