2005
DOI: 10.1038/nature03991

Evolutionary information for specifying a protein fold

Abstract: Classical studies show that for many proteins, the information required for specifying the tertiary structure is contained in the amino acid sequence. Here, we attempt to define the sequence rules for specifying a protein fold by computationally creating artificial protein sequences using only statistical information encoded in a multiple sequence alignment and no tertiary structure information. Experimental testing of libraries of artificial WW domain sequences shows that a simple statistical energy function …

Cited by 396 publications (441 citation statements)
References 48 publications
“…Experimental studies show that these networks are involved in the long-range communication pathways of several well studied systems (15)(16)(17). Most recently, the evolutionary information extracted by SCA was sufficient to engineer de novo artificial members of a family of small-protein interaction modules (18,19). These results demonstrate that, to a significant extent, SCA provides an accurate global picture of amino acid interactions in proteins.…”
mentioning
confidence: 62%
“…We first review how genetic lesions can lead to altered protein function, which can result in changes to the structure and … An insightful example of how to explore this sequence-function relationship in protein domains was carried out by researchers in the Ranganathan and Yaffe laboratories who, using methods from statistical mechanics, generated synthetic WW domains de novo that maintained fold and function 17,18. Further supporting a complex sequence-function relationship, additional studies from the Ranganathan laboratory demonstrated that, in addition to protein architecture described as combinations of modules such as globular domains and linear motifs [19][20][21], protein domains themselves often have well-defined sectors formed by sparse networks of residues often linking spatially distant regions that contribute cooperatively but unequally to its function 22,23.…”
Section: From Genomic Lesions To Functional Network Perturbations
mentioning
confidence: 99%
“…In the following, rather than fixing the values for Ĵ₀, Ĵ₁ and calculating the inferred couplings J₀, J₁, we do the opposite. The reason is that the maximization equations are complicated implicit equations over J₀, J₁ for given Ĵ₀, Ĵ₁ and are simpler to solve for Ĵ₀, Ĵ₁ given the values of J₀, J₁. We show in Fig.…”
Section: Efficiency Of Uniform Regularization For Nonuniform Coupl
mentioning
confidence: 99%
“…When the (Ising or Potts) model includes a large number N of spin variables we show based on numerical simulations that the same phenomenon takes place: Large regularization is necessary, but pseudocounts do a better job than L₂ for strongly heterogeneous networks. We explain why this is so using analytical arguments based on the analysis of the O(m) continuous spin model for large but finite m. MF is exact for this model in the m → ∞ limit, and we show that the optimal pseudocount remains finite in the absence of sampling noise: The optimal penalty is of the order of 1/m, which estimates the deviation of the model with respect to Gaussianity. Moreover, inference is less affected by sampling noise when using large pseudocount than when using a large L₂-norm.…”
Section: Introduction
mentioning
confidence: 99%