2005
DOI: 10.1002/prot.20356

A new method for identification of protein (sub)families in a set of proteins based on hydropathy distribution in proteins

Abstract: Structural similarity among proteins is reflected in the distribution of hydropathicity along the amino acids in the protein sequence. Similarities in the hydropathy distributions are obvious for homologous proteins within a protein family. They also were observed for proteins with related structures, even when sequence similarities were undetectable. Here we present a novel method that employs the hydropathy distribution in proteins for identification of (sub)families in a set of (homologous) proteins. We rep…

Cited by 48 publications (29 citation statements)
References 23 publications

“…The 20 amino acids can be divided into four groups (Pánek et al, 2005), which use the letters L, B, W, and P, respectively, where L includes strong hydrophilic amino acids, B includes strong hydrophobic amino acids, W includes weak hydrophilic or hydrophobic amino acids, and P, G, and C have relatively unique properties, which are divided into a class using the letter P. The information for classification is listed in Table 2. For example, a sequence of protein {MERIKELRDLMSQG} can be converted to {BLLBLLBLLBBWLP} as shown in Table 3.…”
Section: New Feature Extraction Methods (mentioning)
confidence: 99%
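
The conversion described in the statement above can be sketched as a simple lookup. This is a minimal illustration, not the citing authors' code: the residue-to-class map below is partial and is inferred only from the quoted text (P, G, and C share class P) and from the worked example {MERIKELRDLMSQG}. Table 2 of the citing paper is not reproduced on this page, so residues that do not appear here are deliberately left unmapped. The names `PARTIAL_CLASSES` and `to_hydropathy_string` are hypothetical.

```python
# Partial class map, inferred ONLY from the quoted statement and its worked
# example. This is an assumption, not Table 2 of the citing paper.
PARTIAL_CLASSES = {
    "B": "MIL",    # strongly hydrophobic residues seen in the example
    "L": "ERKDQ",  # strongly hydrophilic residues seen in the example
    "W": "S",      # weakly hydrophilic or hydrophobic residue seen in the example
    "P": "PGC",    # residues the statement groups under the letter P
}

# Invert the map so each residue points at its class letter.
RESIDUE_TO_CLASS = {
    residue: cls for cls, residues in PARTIAL_CLASSES.items() for residue in residues
}


def to_hydropathy_string(sequence: str) -> str:
    """Replace each residue with its class letter; fail on unmapped residues."""
    letters = []
    for residue in sequence.upper():
        if residue not in RESIDUE_TO_CLASS:
            raise KeyError(f"residue {residue!r} is not covered by the partial map")
        letters.append(RESIDUE_TO_CLASS[residue])
    return "".join(letters)


print(to_hydropathy_string("MERIKELRDLMSQG"))  # BLLBLLBLLBBWLP
```

Raising on unmapped residues, rather than guessing a class, keeps the sketch from silently extending the map beyond what the quoted text supports.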
“…Instead of dividing the protein sequences into segments and counting the number of hydropathy characters in each segment, the frequencies of the hydropathy blocks occurring in protein sequences are calculated and used to generate fixed-dimensional vectors. Compared to the algorithm proposed by Panek et al [15], our technique is simpler and need not estimate the number of segments. Further, our method can be applied to protein sequences of any length.…”
Section: Introduction (mentioning)
confidence: 94%
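
One way to read the block-frequency idea in the statement above is as a normalized count of short substrings over the four-letter hydropathy alphabet. The sketch below assumes that a "hydropathy block" is a run of k consecutive class letters and that frequencies are normalized by the number of windows; the quoted text does not define either choice, and `block_frequency_vector` and `k` are hypothetical names.

```python
from itertools import product

ALPHABET = "LBWP"  # the four class letters named in the citation statements


def block_frequency_vector(hydro: str, k: int = 2) -> list[float]:
    """Return relative frequencies of all length-k blocks, in a fixed order.

    The vector always has len(ALPHABET) ** k entries, whatever the length
    of the input string. Assumption: a block is k consecutive class letters.
    """
    blocks = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(blocks, 0)
    n_windows = len(hydro) - k + 1
    if n_windows <= 0:
        raise ValueError("sequence is shorter than the block length")
    for i in range(n_windows):
        counts[hydro[i:i + k]] += 1  # KeyError if a letter is outside ALPHABET
    return [counts[b] / n_windows for b in blocks]


short = block_frequency_vector("BLLBLLBLLBBWLP")
longer = block_frequency_vector("BLLBLLBLLBBWLP" * 3)
print(len(short), len(longer))  # 16 16
```

Because the counts are divided by the number of windows, sequences of different lengths map to vectors of the same dimension and comparable scale, which is the property the statement emphasizes.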
“…It is shown that the hydropathy distributions in proteins are useful for recognizing the homologous proteins belonging to the same family. Nevertheless, the method developed by Panek et al [15] requires that: (1) the number of segments must be optimally estimated in advance, otherwise the sequel clustering algorithm will not work well; (2) the length of the protein sequence is not taken into account. Obviously, the longer sequences contain more hydropathy characters compared to the shorter ones.…”
Section: Introduction (mentioning)
confidence: 99%
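
The two limitations listed in the statement above are easier to see with a small sketch of segment-wise counting. This is not the method of Panek et al; it only illustrates the general idea of splitting a hydropathy string into a preset number of segments and counting class letters in each. Equal-length segments, the function name `segment_counts`, and the parameter `n_segments` are all assumptions made for illustration.

```python
def segment_counts(hydro: str, n_segments: int, alphabet: str = "LBWP") -> list[int]:
    """Count each class letter in each of n_segments near-equal segments.

    Assumption: segments are contiguous and of near-equal length. The
    number of segments must be chosen by the caller in advance.
    """
    if not 1 <= n_segments <= len(hydro):
        raise ValueError("n_segments must be between 1 and the sequence length")
    vector = []
    for s in range(n_segments):
        start = s * len(hydro) // n_segments
        end = (s + 1) * len(hydro) // n_segments
        segment = hydro[start:end]
        vector.extend(segment.count(letter) for letter in alphabet)
    return vector


once = segment_counts("BLLBLLBLLBBWLP", 2)
twice = segment_counts("BLLBLLBLLBBWLP" * 2, 2)
print(once)        # [4, 3, 0, 0, 3, 2, 1, 1]
print(sum(twice))  # 28
```

The raw counts sum to the sequence length, so a sequence twice as long yields a vector with twice the total weight, and the vector's dimension changes with the chosen `n_segments`. Both effects correspond to the points raised in the quoted statement.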
“…Base on Panek's scale [15], G is a separate class, R, K, E, H are strongly hydrophilic, S, T are weak hydrophilicity. L and A are strong hydrophobic.…”
Section: The Statistical Analysis Of Sequence Segment (mentioning)
confidence: 99%