A major revival in the use of classical electrostatics as an approach to the study of charged and polar molecules in aqueous solution has been made possible through the development of fast numerical and computational methods to solve the Poisson-Boltzmann equation for solute molecules that have complex shapes and charge distributions. Graphical visualization of the calculated electrostatic potentials generated by proteins and nucleic acids has revealed insights into the role of electrostatic interactions in a wide range of biological phenomena. Classical electrostatics has also proved to be a successful quantitative tool, yielding accurate descriptions of electrical potentials, diffusion-limited processes, pH-dependent properties of proteins, ionic strength-dependent phenomena, and the solvation free energies of organic molecules.
The application of all-atom force fields (and explicit or implicit solvent models) to protein homology-modeling tasks such as side-chain and loop prediction remains challenging both because of the expense of the individual energy calculations and because of the difficulty of sampling the rugged all-atom energy surface. Here we address this challenge for the problem of loop prediction through the development of numerous new algorithms, with an emphasis on multiscale and hierarchical techniques. As a first step in evaluating the performance of our loop prediction algorithm, we have applied it to the problem of reconstructing loops in native structures; we also explicitly include crystal packing to provide a fair comparison with crystal structures. In brief, large numbers of loops are generated by using a dihedral angle-based buildup procedure followed by iterative cycles of clustering, side-chain optimization, and complete energy minimization of selected loop structures. We evaluate this method by using the largest test set yet used for validation of a loop prediction method, with a total of 833 loops ranging from 4 to 12 residues in length. Average/median backbone root-mean-square deviations (RMSDs) to the native structures (superimposing the body of the protein, not the loop itself) are 0.42/0.24 Å for 5-residue loops, 1.00/0.44 Å for 8-residue loops, and 2.47/1.83 Å for 11-residue loops. Median RMSDs are substantially lower than the averages because of a small number of outliers; the causes of these failures are examined in some detail, and many can be attributed to errors in assignment of protonation states of titratable residues, omission of ligands from the simulation, and, in a few cases, probable errors in the experimentally determined structures. When these obvious problems in the data sets are filtered out, average RMSDs to the native structures improve to 0.43 Å for 5-residue loops, 0.84 Å for 8-residue loops, and 1.63 Å for 11-residue loops.
In the vast majority of cases, the method locates energy minima that are lower than or equal to that of the minimized native loop, thus indicating that sampling rarely limits prediction accuracy. The overall results are, to our knowledge, the best reported to date, and we attribute this success to the combination of an accurate all-atom energy function, efficient methods for loop buildup and side-chain optimization, and, especially for the longer loops, the hierarchical refinement protocol.
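The RMSD convention described in this abstract (the protein body is superimposed, the loop itself is not) and the gap between average and median RMSD can be illustrated with a minimal sketch. It assumes the predicted and native loop coordinates are already expressed in the same frame, that is, the body superposition has been done beforehand. The function name `loop_rmsd` and all numbers are hypothetical illustrations, not data or code from the paper.

```python
import math
import statistics


def loop_rmsd(predicted, native):
    """RMSD between two equally long lists of (x, y, z) backbone atoms.

    No superposition is performed here: both coordinate sets are assumed
    to be in the frame obtained by superimposing the rest of the protein.
    """
    if len(predicted) != len(native) or not predicted:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (p[k] - n[k]) ** 2
        for p, n in zip(predicted, native)
        for k in range(3)
    )
    return math.sqrt(squared / len(predicted))


# Made-up per-loop RMSDs (in Å): three good predictions and one outlier.
example_rmsds = [0.20, 0.30, 0.25, 3.00]
mean_rmsd = statistics.mean(example_rmsds)      # pulled upward by the outlier
median_rmsd = statistics.median(example_rmsds)  # barely affected by it
```

Because a single failed prediction dominates the mean but not the median, reporting both statistics, as the abstract does, separates typical accuracy from the effect of a few outliers.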
The recognition of specific DNA sequences by proteins is thought to depend on two types of mechanisms: one that involves the formation of hydrogen bonds with specific bases, primarily in the major groove, and one involving sequence-dependent deformations of the DNA helix. By comprehensively analyzing the three-dimensional structures of protein-DNA complexes, we show that the binding of arginines to narrow minor grooves is a widely used mode for protein-DNA recognition. This readout mechanism exploits the phenomenon that narrow minor grooves strongly enhance the negative electrostatic potential of the DNA. The nucleosome core particle offers a striking example of this effect. Minor groove narrowing is often associated with the presence of A-tracts, AT-rich sequences that exclude the flexible TpA step. These findings suggest that the ability to detect local variations in DNA shape and electrostatic potential is a general mechanism that enables proteins to use information in the minor groove, which otherwise offers few opportunities for the formation of base-specific hydrogen bonds, to achieve DNA binding specificity.
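The description of A-tracts given above (AT-rich stretches that contain no TpA step) can be turned into a simple sequence scan. The sketch below is only an illustration of that description: the function name `find_a_tracts` and the minimum length `min_len` are hypothetical choices, since the abstract does not state a length threshold.

```python
def find_a_tracts(seq, min_len=4):
    """Return (start, subsequence) for runs of A/T bases with no TpA step.

    A run of consecutive A/T bases is cut wherever a T is followed by an A,
    and only pieces of at least `min_len` bases are reported.
    `min_len` is an assumed threshold, not a value taken from the abstract.
    """
    seq = seq.upper()
    tracts = []
    start = None
    # A trailing sentinel base closes any run that reaches the end.
    for i, base in enumerate(seq + "N"):
        in_at = base in "AT"
        tpa_step = in_at and start is not None and seq[i - 1] == "T" and base == "A"
        if start is not None and (not in_at or tpa_step):
            if i - start >= min_len:
                tracts.append((start, seq[start:i]))
            # After a TpA step a new candidate run begins at the A.
            start = i if in_at else None
        elif in_at and start is None:
            start = i
    return tracts


tracts_without_tpa = find_a_tracts("GCAAATTTGC")  # AAATTT has no TpA step
tracts_with_tpa = find_a_tracts("GCTTTAAAGC")     # TTTAAA is split at TpA
```

In the second sequence the TpA step splits the AT-rich region into two three-base pieces, so nothing is reported at the assumed threshold of four bases.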
We present a numerical method for calculating the electrostatic potential of molecules in solution, using the linearized Poisson-Boltzmann equation. The emphasis in this work is on applications to biological macromolecules. The accuracy of the method is assessed by comparisons with analytic solutions for the case of a single charge in a dielectric sphere (Tanford-Kirkwood theory), which serves as a model for a macromolecule. We find that the solutions are generally accurate to within 5%. Larger errors occur close to the charge and the dielectric boundary, but the maximum error found at ion-bonding distance (3 Å) from a charge close to the boundary (1 Å deep) is only -15%. Several algorithmic improvements, described here, contribute to the accuracy of the method. The programs involved compose a coherent software package, called DelPhi, which goes from a Brookhaven Protein Data Bank format file to calculated electrostatic fields.
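The kind of analytic benchmark mentioned above can be sketched for the simplest case only: a point charge at the center of a dielectric sphere surrounded by salt solution, for which the linearized Poisson-Boltzmann equation has a closed-form solution. This is not the full Tanford-Kirkwood treatment (which handles off-center charges) and it is not DelPhi code. The function names, the default parameters, the choice of units (Å, elementary charges, kcal/(mol·e)), and the assumption that mobile ions are excluded exactly up to the sphere surface are all illustrative.

```python
import math

# Coulomb constant in kcal·Å/(mol·e²), so potentials come out in kcal/(mol·e).
COULOMB = 332.06371


def sphere_potential(r, q=1.0, radius=10.0, eps_in=2.0, eps_out=80.0, kappa=0.1):
    """Potential at distance r (Å) from a charge q at the center of a sphere.

    Inside the sphere (dielectric eps_in, no mobile ions) the potential is
    Coulombic plus a constant reaction-field term; outside (dielectric
    eps_out, inverse Debye length kappa in 1/Å) it is a screened Coulomb term.
    """
    if r <= 0:
        raise ValueError("r must be positive")
    outer = COULOMB * q / (eps_out * (1.0 + kappa * radius))
    if r >= radius:
        return outer * math.exp(-kappa * (r - radius)) / r
    return COULOMB * q / eps_in * (1.0 / r - 1.0 / radius) + outer / radius


def percent_error(numerical, analytic):
    """Signed relative error of a numerical value, in percent."""
    return 100.0 * (numerical - analytic) / analytic
```

A grid-based solver could be checked by evaluating `percent_error` at matching points; a negative value such as -15% means the numerical potential is smaller in magnitude than the analytic one.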
Specific interactions between proteins and DNA are fundamental to many biological processes. In this review, we provide a revised view of protein-DNA interactions that emphasizes the importance of the three-dimensional structures of both macromolecules. We divide protein-DNA interactions into two categories: those where the protein recognizes the unique chemical signatures of the DNA bases (base readout) and those where the protein recognizes a sequence-dependent DNA shape (shape readout). We further divide base readout into interactions that occur in the major groove and those that occur in the minor groove. Analogously, the readout of DNA shape is subdivided into global shape recognition, for example when the DNA helix exhibits an overall bend, and local shape recognition, for example when a base pair step is kinked or when a region of the minor groove is narrow. Based on the >1500 structures of protein-DNA complexes now available in the Protein Data Bank, we argue that individual DNA binding proteins combine multiple readout mechanisms to achieve DNA binding specificity. Specificity that distinguishes between families frequently involves base readout in the major groove, while shape readout is often exploited for higher-resolution specificity, to distinguish between members within the same DNA-binding protein family.