2003
DOI: 10.1093/protein/gzg044

Reduction of protein sequence complexity by residue grouping

Abstract: It is well known that there are some similarities among various naturally occurring amino acids. Thus, the complexity in protein systems could be reduced by sorting these amino acids with similarities into groups and then protein sequences can be simplified by reduced alphabets. This paper discusses how to group similar amino acids and whether there is a minimal amino acid alphabet by which proteins can be folded. Various reduced alphabets are obtained by reserving the maximal information for the simplified pr…
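
The simplification described in the abstract, rewriting a sequence in a reduced alphabet, can be sketched as a lookup from residue to group label. The two-group hydrophobic/polar (HP) partition below is one commonly used assignment chosen purely for illustration; it is an assumption and not necessarily the grouping derived in the paper. The names `HP_GROUPS`, `RESIDUE_TO_GROUP` and `reduce_sequence` are hypothetical.

```python
# Illustrative two-letter (HP) reduced alphabet.
# ASSUMPTION: this partition is a common convention, not the paper's result.
HP_GROUPS = {
    "H": "ACFILMVWY",    # residues treated here as hydrophobic
    "P": "DEGHKNPQRST",  # residues treated here as polar
}

# Invert the grouping into a residue -> group-label lookup table.
RESIDUE_TO_GROUP = {
    residue: label
    for label, members in HP_GROUPS.items()
    for residue in members
}


def reduce_sequence(sequence: str) -> str:
    """Rewrite a protein sequence in the reduced alphabet.

    Symbols outside the 20 standard amino acids are mapped to 'X'.
    """
    return "".join(RESIDUE_TO_GROUP.get(res, "X") for res in sequence.upper())


reduced = reduce_sequence("MKVLAT")
print(reduced)
```

Swapping in a different partition (for example a five-letter alphabet) only requires changing `HP_GROUPS`, as long as each residue appears in exactly one group.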

Cited by 132 publications (99 citation statements)
References 14 publications
“…Through the maximization of S, a series of optimal groupings could be obtained (as shown in Figure 5). It is interesting that the HP grouping determined with BLOSUM62 matrix is same as that from the interaction [102]. This consistency further confirms that the HP grouping is a fundamental feature of protein systems, and also demonstrates the correctness of these grouping methods which precisely characteristics of protein systems.…”
Section: Simplification Based On Information In Sequence
supporting
confidence: 64%
“…The simplified groups which maximizes the information in simplified sequences. Note: Generated based on the data from [102].…”
Section: Simplification Based On Features Of Protein Systems
mentioning
confidence: 99%
“…Usually, iterative selection and combination can greatly reduce the redundancy of the sequence space (45)(46)(47)(48)(49). Based on the properties of all the purified single mutant proteins as to each Cys and Met site and assuming strict additivity, the number of mutants expected to be more active than the wild type in terms of the main activity value is estimated to be 8.9 × 10^7 (= 4 (number of mutants with higher main activity than the wild type at Cys-152) × 2 (at Cys-158), × 4 (at Cys-211), × 3 (at Cys-332), × 6 (at Met-52), × 3 (at Met-65), × 5 (at Met-98), × 6 (at Met-110), × 7 (at Met-213), × 5 (at Met-276), × 7 (at Met-346), and × 7 (at Met 349)).…”
mentioning
confidence: 99%
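
The estimate in the statement above is a plain product of per-site counts, which follows from the strict-additivity assumption the authors state. A short check of that arithmetic, using only the counts given in the quote (the dictionary name `counts` is a hypothetical label):

```python
from math import prod

# Per-site counts of single mutants with higher main activity than the
# wild type, copied from the quoted statement.
counts = {
    "Cys-152": 4, "Cys-158": 2, "Cys-211": 4, "Cys-332": 3,
    "Met-52": 6, "Met-65": 3, "Met-98": 5, "Met-110": 6,
    "Met-213": 7, "Met-276": 5, "Met-346": 7, "Met-349": 7,
}

# Under strict additivity, beneficial substitutions combine independently,
# so the number of combined mutants is the product over sites.
total = prod(counts.values())
print(f"{total} (about {total:.1e})")
```

The product is 88,905,600, which rounds to the 8.9 × 10^7 given in the quote.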
“…The most familiar scenario for the use of set covers is the case of R as a set of amino acids, so henceforth we concentrate mainly on multiple protein alignments, even though it makes perfect sense for R to be a set of nucleotides as well. Set covers of an amino-acid alphabet have been studied extensively, as for example in [50,20,53,48,28,32,24,36,52,66,19,5,27].…”
Section: Introductionmentioning
confidence: 99%