Colombo. Selecting sequences that fold into a defined 3D structure: A new approach for protein design based on molecular dynamics and energetics. Biophysical Chemistry, Elsevier, 2009, 146 (2-3)
ACCEPTED MANUSCRIPT
Abstract
The problem of finding amino acid sequences able to fold into a defined three-dimensional (3D) structure lies at the heart of successful protein design. Herein, we present the results of applying a novel energy decomposition approach, based on all-atom molecular dynamics, to the selection of sequences able to fold into a given 3D conformation. First, the energy decomposition approach is applied to natural sequences associated with a well-defined structure to identify the principal energetic coupling interactions necessary to stabilize it, which define the specific energetic signature of the fold. Then, several different sequences are threaded onto the defined 3D structure, and only those sequences whose energetic signature (pattern) is close to that of the natural sequence, according to a similarity criterion, are selected as able to populate the specific fold. Furthermore, the fitness of a given sequence for a fold can be evaluated by combining the information provided by the energetic signature with that contained in the contact map, which recapitulates the fold topology. The results show that a better fit between the energetic properties of a sequence and the topology corresponds to better stabilization of the protein fold by that sequence. We applied this approach to a library of natural and artificial WW domain sequences, previously developed by the Ranganathan group, containing sequences experimentally known to be either able or unable to fold into native structures. The results show that our approach correctly identifies 70% of the sequences known to populate the typical WW domain fold.
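The selection step described in the abstract can be illustrated with a minimal sketch. It rests on assumptions the abstract does not state: the energetic signature is modeled as the eigenvector belonging to the lowest eigenvalue of a symmetric residue-residue interaction energy matrix, the similarity criterion is taken to be the absolute cosine similarity between signatures, and the 0.9 threshold is arbitrary. The function names `energetic_signature`, `signature_similarity`, and `select_sequences` are hypothetical and are not taken from the paper.

```python
import math


def energetic_signature(energy_matrix, iterations=500):
    """Unit eigenvector of the lowest eigenvalue of a symmetric matrix.

    ASSUMPTION: the "energetic signature" is modeled as this eigenvector;
    the abstract does not specify the decomposition that is actually used.
    Power iteration is run on (shift * I - M), whose dominant eigenvector
    is the lowest-eigenvalue eigenvector of M.
    """
    n = len(energy_matrix)
    # Gershgorin bound: no eigenvalue of M exceeds the largest absolute row sum.
    shift = max(sum(abs(x) for x in row) for row in energy_matrix) or 1.0
    vec = [1.0 + 0.01 * i for i in range(n)]  # arbitrary starting vector
    for _ in range(iterations):
        new = [
            shift * vec[i] - sum(energy_matrix[i][j] * vec[j] for j in range(n))
            for i in range(n)
        ]
        norm = math.sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    return vec


def signature_similarity(sig_a, sig_b):
    """Absolute cosine similarity (ASSUMED criterion; eigenvector sign is arbitrary)."""
    dot = sum(a * b for a, b in zip(sig_a, sig_b))
    norm_a = math.sqrt(sum(a * a for a in sig_a))
    norm_b = math.sqrt(sum(b * b for b in sig_b))
    return abs(dot) / (norm_a * norm_b)


def select_sequences(native_matrix, candidate_matrices, threshold=0.9):
    """Names of candidates whose signature is close to the native one.

    candidate_matrices maps a sequence name to the interaction energy matrix
    obtained after threading that sequence on the target structure.
    The threshold value is a hypothetical choice for illustration.
    """
    native_sig = energetic_signature(native_matrix)
    selected = []
    for name, matrix in candidate_matrices.items():
        candidate_sig = energetic_signature(matrix)
        if signature_similarity(native_sig, candidate_sig) >= threshold:
            selected.append(name)
    return selected
```

In practice the energy matrices would come from molecular dynamics simulations of each threaded sequence. Here they are plain nested lists so that the sketch stays self-contained.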