2015
DOI: 10.1016/j.procs.2015.04.217

A Novel Technique of Feature Extraction with Dual Similarity Measures for Protein Sequence Classification

Abstract: In this article, a novel approach for extracting features from protein sequences is proposed. This approach extracts only six features corresponding to each protein sequence. These features are computed by globally considering the probabilities of occurrences of the amino acids in different positions within the superfamily which locally belongs to the six exchange groups. Then, these features are used as an input to the Neural Network formed by Boolean-Like Training Algorithm (BLTA). The BLTA is used to classi…
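To make the idea of "six features per sequence" concrete, here is a minimal sketch. It is not the paper's method: the abstract describes position-dependent probabilities computed within a superfamily, and those details are truncated here. The sketch uses a simpler stand-in, the fraction of a sequence's residues that fall in each exchange group. The group membership shown is the grouping commonly used in the protein classification literature and is an assumption, since the abstract does not list it. The names `EXCHANGE_GROUPS` and `exchange_group_features` are hypothetical.

```python
from collections import Counter

# Assumed exchange groups (commonly used grouping; not stated in the abstract).
EXCHANGE_GROUPS = {
    "e1": set("HRK"),
    "e2": set("DENQ"),
    "e3": set("C"),
    "e4": set("STPAG"),
    "e5": set("MILV"),
    "e6": set("FYW"),
}


def exchange_group_features(sequence):
    """Return six fractions, one per exchange group, for a protein sequence.

    Simplified stand-in for the paper's features: plain composition per
    group, ignoring residue positions and superfamily statistics.
    """
    counts = Counter(sequence.upper())
    # Count only residues that belong to one of the six groups.
    per_group = [sum(counts[aa] for aa in group) for group in EXCHANGE_GROUPS.values()]
    total = sum(per_group)
    if total == 0:
        return [0.0] * len(EXCHANGE_GROUPS)
    return [n / total for n in per_group]
```

Each sequence, whatever its length, maps to a fixed-length vector of six numbers, which is the property that lets such features feed a classifier or clustering algorithm directly.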

Cited by 8 publications (15 citation statements)
References 9 publications (21 reference statements)
“…This is usually circumvented by aligning the sequences [26] or, in cases where aligning is impossible, with feature extraction, i.e. representing the sequences as a feature vector reflecting their properties [22][23][24][25][27][28][29]. Unfortunately, the resulting feature vectors are inherently biased by the method of feature extraction used [30].…”
mentioning
confidence: 99%
“…This implies that how these protein sequences can be represented in terms of feature vectors so that these feature vectors can be applied as an input to the clustering algorithm. For this purpose, we have used the encoding technique presented in [53] that entails the extraction of six features corresponding to each protein sequence.…”
Section: B Datasets
mentioning
confidence: 99%
“…Gupta et al [28] used the general version of Chou [29] pseudo amino acid composition, which is a sixty-dimensional numerical feature vector of protein sequences, to develop an alignment-free approach for finding similarity across protein sequences. Bharill et al [30] developed an approach to extract six-dimensional numerical feature vectors from a protein sequence. Likewise, many feature extraction techniques [23,20,24,31,25,30,28] have been introduced in the past, but, none of them is scalable.…”
Section: Introduction
mentioning
confidence: 99%
“…Bharill et al [30] developed an approach to extract six-dimensional numerical feature vectors from a protein sequence. Likewise, many feature extraction techniques [23,20,24,31,25,30,28] have been introduced in the past, but, none of them is scalable. A scalable approach for selecting statistically relevant characteristics from a large sequence is required.…”
Section: Introduction
mentioning
confidence: 99%