2019
DOI: 10.1016/j.knosys.2018.10.007

Predicting protein structural classes for low-similarity sequences by evaluating different features

Cited by 188 publications (92 citation statements)
References 100 publications
“…Besides, previous predictors relied heavily on the features derived from third-party tools, such as position-specific score matrix (PSSM) [23], Z-coordinate, secondary structure [24], and so on. Although these features contribute to the improvement of the predictor performance [25][26][27], their weakness cannot be ignored. On the one hand, using these third-party tool-derived features will make the predictor slow and may lead to uncontrollable failure.…”
Section: Methods (mentioning)
confidence: 99%
“…PSI-BLAST was used to assess the PSSM for each sequence based on sequences in the non-redundant Swiss-PROT database that share significant similarity, with three iterations and an e-value threshold of 0.0001 (Bhagwat and Aravind, 2007; Zhu et al., 2019). The raw PSSMs are n × 20 matrices; n rows indicate the query protein residues with n being the length of the protein sequence and 20 columns represent the 20 standard amino acids that may exist in the related protein sequences.…”
Section: Data Sets (mentioning)
confidence: 99%
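
The PSSM step described in the statement above can be sketched roughly as follows. This is a minimal illustration, not the citing authors' pipeline: the file names and database name are hypothetical, and mapping the quoted "e-value threshold of 0.0001" to the BLAST+ `-evalue` option is an assumption (the statement does not say whether the report threshold or the inclusion threshold was meant). The parser assumes the usual ASCII PSSM layout written by `-out_ascii_pssm`, where each data row starts with a position, a residue letter, and 20 log-odds scores.

```python
from typing import List

# Column order of the 20 amino acids in PSI-BLAST ASCII PSSM output.
PSSM_COLUMNS = "ARNDCQEGHILKMFPSTWYV"


def psiblast_command(query_fasta: str, db: str, out_pssm: str) -> List[str]:
    """Build a BLAST+ psiblast argument list (file and db names are hypothetical)."""
    return [
        "psiblast",
        "-query", query_fasta,
        "-db", db,
        "-num_iterations", "3",     # three iterations, as in the quoted statement
        "-evalue", "0.0001",        # assumption: threshold mapped to -evalue
        "-out_ascii_pssm", out_pssm,
    ]


def parse_ascii_pssm(text: str) -> List[List[int]]:
    """Return the n x 20 log-odds block from the text of an ASCII PSSM file."""
    rows: List[List[int]] = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows: <position> <residue> <20 scores> [<20 percentages> ...]
        if len(fields) >= 22 and fields[0].isdigit() and fields[1].isalpha():
            rows.append([int(v) for v in fields[2:22]])
    return rows
```

The argument list can be passed to `subprocess.run` once BLAST+ and a formatted database are available. The parsed result has one row per residue of the query and one column per entry of `PSSM_COLUMNS`.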
“…Selecting representative features is a crucial step because they directly determine prediction performance [41,42]. Seven sequence-based feature extraction methods were used: local structural entropy (LSE), NetSurfP, DisEMBL, overall amino acid composition (OAAC), dipeptide composition, PSSM profiles, and physicochemical properties.…”
Section: Feature Extraction (mentioning)
confidence: 99%
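
Two of the sequence-based features named in the statement above, overall amino acid composition and dipeptide composition, are simple enough to sketch directly. The code below is a minimal illustration under common textbook definitions (relative frequency of each of the 20 residues, and of each of the 400 ordered residue pairs); the function names are hypothetical, and the citing paper may normalize or handle non-standard residues differently.

```python
from itertools import product
from typing import Dict

STANDARD_AA = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def amino_acid_composition(seq: str) -> Dict[str, float]:
    """Fraction of each standard residue in seq (20 values)."""
    counts = {aa: 0 for aa in STANDARD_AA}
    for ch in seq.upper():
        if ch in counts:  # non-standard letters such as X are skipped
            counts[ch] += 1
    total = sum(counts.values())
    return {aa: (c / total if total else 0.0) for aa, c in counts.items()}


def dipeptide_composition(seq: str) -> Dict[str, float]:
    """Fraction of each ordered residue pair among adjacent pairs (400 values)."""
    seq = seq.upper()
    counts = {a + b: 0 for a, b in product(STANDARD_AA, repeat=2)}
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs containing non-standard letters are skipped
            counts[pair] += 1
    total = sum(counts.values())
    return {dp: (c / total if total else 0.0) for dp, c in counts.items()}
```

Concatenating the two dictionaries' values in a fixed key order gives a 420-dimensional vector per sequence, which could then be combined with the other feature groups listed in the statement.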