Boltzmann-like Statistics of Protein Architectures

Finkelstein, Alexei V.; Gutin, A. M.; Badretdinov, Azat Ya.

doi:10.1007/978-1-4899-1727-0_1

Cited by 49 publications

(51 citation statements)

References 54 publications


“…When extracting a matrix of species-species energies for proteins from the statistics of protein data bank, such as the Miyazawa and Jernigan (MJ) matrix [25], what one obtains is actually $B^{MJ}_{ij} = B^{p}_{ij}/T_p$ (see Ref. [27] and Appendix B).…”

Section: Discussion (mentioning)

confidence: 99%

How accurate must potentials be for successful modeling of protein folding?

Pande

Grosberg

Tanaka

1995

The Journal of Chemical Physics


Protein sequences are believed to have been selected to provide the stability of, and reliable renaturation to, an encoded unique spatial fold. In recently proposed theoretical schemes, this selection is modeled as "minimal frustration," or "optimal energy" of the desirable target conformation over all possible sequences, such that the "design" of the sequence is governed by the interactions between monomers. With replica mean field theory, we examine the possibility to reconstruct the renaturation, or freezing transition, of the "designed" heteropolymer given the inevitable errors in the determination of interaction energies, that is, the difference between sets (matrices) of interactions governing chain design and conformations, respectively. We find that the possibility of folding to the designed conformation is controlled by the correlations of the elements of the design and renaturation interaction matrices; unlike random heteropolymers, the ground state of designed heteropolymers is sufficiently stable, such that even a substantial error in the interaction energy should still yield correct renaturation.


“…This concern has led us (and others) to look at how many sequences correspond to different possible structures and how this number could affect their relative distribution among biological proteins [1-6] as well as their thermodynamic properties [3,5,7-13]. Studies such as these (with a few exceptions) have had two major limitations. Most of these studies on protein models examine the nature of the mapping of sequence to structure and how this is affected by the details of the model.…”

Section: Introduction (mentioning)

confidence: 99%

Evolution of functionality in lattice proteins

Williams

Pollock

Goldstein

2001

Journal of Molecular Graphics and Modelling


“…We present evidence that the size of individual gene families is influenced not only by the designability of the structure, but also by evolutionary history, e.g., the amount of time the gene family was in existence. We further show that our observed statistical correlation between gene family size and contact density of the structure is valid on many levels of evolutionary divergence, i.e., not only for closely related sequence, but also for less-related fold and superfamily levels of homology. Gene family and domain-fold family sizes are known to vary widely (Finkelstein and Ptitsyn 1987; Finkelstein et al. 1995; Orengo et al. 1999; Teichmann et al. 1999; Yanai et al. 2000; Vitkup et al. 2001; Koonin et al. 2002), from orphans (families that have only a single member) to considerably populated sets of far-diverged homologs. The observed variability in the number and divergence of gene family members raises many questions, e.g., which genetic mechanisms and evolutionary dynamics could have led to the observed unevenness?…”

mentioning

confidence: 99%

Protein structure and evolutionary history determine sequence space topology

Shakhnovich

Deeds

DeLisi

et al.

2005

Genome Res.


Understanding the observed variability in the number of homologs of a gene is a very important unsolved problem that has broad implications for research into coevolution of structure and function, gene duplication, pseudogene formation, and possibly for emerging diseases. Here, we attempt to define and elucidate some possible causes behind the observed irregularity in sequence space. We present evidence that sequence variability and functional diversity of a gene or fold family is influenced by quantifiable characteristics of the protein structure. These characteristics reflect the structural potential for sequence plasticity, i.e., the ability to accept mutation without losing thermodynamic stability. We identify a structural feature of a protein domain, contact density, that serves as a determinant of entropy in sequence space, i.e., the ability of a protein to accept mutations without destroying the fold (also known as fold designability). We show that the logarithm of average gene family size exhibits statistical correlation ($R^2 > 0.9$) with the contact density of its three-dimensional structure. We present evidence that the size of individual gene families is influenced not only by the designability of the structure, but also by evolutionary history, e.g., the amount of time the gene family was in existence. We further show that our observed statistical correlation between gene family size and contact density of the structure is valid on many levels of evolutionary divergence, i.e., not only for closely related sequence, but also for less-related fold and superfamily levels of homology.

Gene family and domain-fold family sizes are known to vary widely (Finkelstein and Ptitsyn 1987; Finkelstein et al. 1995; Orengo et al. 1999; Teichmann et al. 1999; Yanai et al. 2000; Vitkup et al. 2001; Koonin et al. 2002), from orphans (families that have only a single member) to considerably populated sets of far-diverged homologs. The observed variability in the number and divergence of gene family members raises many questions, e.g., which genetic mechanisms and evolutionary dynamics could have led to the observed unevenness? Evolutionary biologists have proposed models designed to explain these size distributions (which often follow power laws) (Yanai et al. 2000; Dokholyan et al. 2002; Koonin et al. 2002; Deeds et al. 2003), while assuming no inherent physical differences between gene families from the outset (Huynen and van Nimwegen 1998; Qian et al. 2001; Dokholyan et al. 2002; Koonin et al. 2002). However, many of these models are too abstract to explain family size distributions in a constructive manner that relates specific features of gene families to their reported size. Nor do these models provide explicit insights into the mechanistic details that might explain observed differences. On the other hand, some researchers have hypothesized that the heterogeneity in family size is due to an underlying distribution of biological or physical properties (Finkelstein et al. 1995; Govindarajan and Goldstein 1996; Li et al. 1996...
