Methods for protein modeling and design advanced rapidly in recent years. At the heart of these computational methods is an energy function that calculates the free energy of the system. Many of these functions were also developed to estimate the consequence of mutation on protein stability or binding affinity. In the current study, we chose six different methods that were previously reported as being able to predict the change in protein stability (DeltaDeltaG) upon mutation: CC/PBSA, EGAD, FoldX, I-Mutant2.0, Rosetta and Hunter. We evaluated their performance on a large set of 2156 single mutations, avoiding for each program the mutations used for training. The correlation coefficients between experimental and predicted DeltaDeltaG values were in the range of 0.59 for the best and 0.26 for the worst performing method. All the tested computational methods showed a correct trend in their predictions, but failed in providing the precise values. This is not due to lack in precision of the experimental data, which showed a correlation coefficient of 0.86 between different measurements. Combining the methods did not significantly improve prediction accuracy compared to a single method. These results suggest that there is still room for improvement, which is crucial if we want forcefields to perform better in their various tasks.
Next-generation sequencing technology has enabled the detection of rare genetic or somatic mutations and contributed to our understanding of disease progression and evolution. However, many next-generation sequencing technologies first rely on DNA amplification, via the Polymerase Chain Reaction (PCR), as part of sample preparation workflows. Mistakes made during PCR appear in sequencing data and contribute to false mutations that can ultimately confound genetic analysis. In this report, a single-molecule sequencing assay was used to comprehensively catalog the different types of errors introduced during PCR, including polymerase misincorporation, structure-induced template-switching, PCR-mediated recombination and DNA damage. In addition to well-characterized polymerase base substitution errors, other sources of error were found to be equally prevalent. PCR-mediated recombination by Taq polymerase was observed at the single-molecule level, and surprisingly found to occur as frequently as polymerase base substitution errors, suggesting it may be an underappreciated source of error for multiplex amplification reactions. Inverted repeat structural elements in lacZ caused polymerase template-switching between the top and bottom strands during replication and the frequency of these events were measured for different polymerases. For very accurate polymerases, DNA damage introduced during temperature cycling, and not polymerase base substitution errors, appeared to be the major contributor toward mutations occurring in amplification products. In total, we analyzed PCR products at the single-molecule level and present here a more complete picture of the types of mistakes that occur during DNA amplification.
The CAPRI and CASP prediction experiments have demonstrated the power of community wide tests of methodology in assessing the current state of the art and spurring progress in the very challenging areas of protein docking and structure prediction. We sought to bring the power of community wide experiments to bear on a very challenging protein design problem that provides a complementary but equally fundamental test of current understanding of protein-binding thermodynamics. We have generated a number of designed protein-protein interfaces with very favorable computed binding energies but which do not appear to be formed in experiments, suggesting there may be important physical chemistry missing in the energy calculations. 28 research groups took up the challenge of determining what is missing: we provided structures of 87 designed complexes and 120 naturally occurring complexes and asked participants to identify energetic contributions and/or structural features that distinguish between the two sets. The community found that electrostatics and solvation terms partially distinguish the designs from the natural complexes, largely due to the non-polar character of the designed interactions. Beyond this polarity difference, the community found that the designed binding surfaces were on average structurally less embedded in the designed monomers, suggesting that backbone conformational rigidity at the designed surface is important for realization of the designed function. These results can be used to improve computational design strategies, but there is still much to be learned; for example, one designed complex, which does form in experiments, was classified by all metrics as a non-binder.
We describe a suite of SPACE tools for analysis and prediction of structures of biomolecules and their complexes. LPC/CSU software provides a common definition of inter-atomic contacts and complementarity of contacting surfaces to analyze protein structure and complexes. In the current version of LPC/CSU, analyses of water molecules and nucleic acids have been added, together with improved and expanded visualization options using Chime or Java based Jmol. The SPACE suite includes servers and programs for: structural analysis of point mutations (MutaProt); side chain modeling based on surface complementarity (SCCOMP); building a crystal environment and analysis of crystal contacts (CryCo); construction and analysis of protein contact maps (CMA) and molecular docking software (LIGIN). The SPACE suite is accessed at .
Selective dimerization of the basic-region leucine-zipper (bZIP) transcription factors presents a vivid example of how a high degree of interaction specificity can be achieved within a family of structurally similar proteins. The coiled-coil motif that mediates homo- or hetero-dimerization of the bZIP proteins has been intensively studied, and a variety of methods have been proposed to predict these interactions from sequence data. In this work, we used a large quantitative set of 4,549 bZIP coiled-coil interactions to develop a predictive model that exploits knowledge of structurally conserved residue-residue interactions in the coiled-coil motif. Our model, which expresses interaction energies as a sum of interpretable residue-pair and triplet terms, achieves a correlation with experimental binding free energies of R = 0.68 and significantly out-performs other scoring functions. To use our model in protein design applications, we devised a strategy in which synthetic peptides are built by assembling 7-residue native-protein heptad modules into new combinations. An integer linear program was used to find the optimal combination of heptads to bind selectively to a target human bZIP coiled coil, but not to target paralogs. Using this approach, we designed peptides to interact with the bZIP domains from human JUN, XBP1, ATF4 and ATF5. Testing more than 132 candidate protein complexes using a fluorescence resonance energy transfer assay confirmed the formation of tight and selective heterodimers between the designed peptides and their targets. This approach can be used to make inhibitors of native proteins, or to develop novel peptides for applications in synthetic biology or nanotechnology.
The three-dimensional structures of proteins are stabilized by the interactions between amino acid residues. Here we report a method where four distances are calculated between any two side chains to provide an exact spatial definition of their bonds. The data were binned into a four-dimensional grid and compared to a random model, from which the preference for specific four-distances was calculated. A clear relation between the quality of the experimental data and the tightness of the distance distribution was observed, with crystal structure data providing far tighter distance distributions than NMR data. Since the four-distance data have higher information content than classical bond descriptions, we were able to identify many unique inter-residue features not found previously in proteins. For example, we found that the side chains of Arg, Glu, Val and Leu are not symmetrical in respect to the interactions of their head groups. The described method may be developed into a function, which computationally models accurately protein structures.
BackgroundAccurate evaluation and modelling of residue-residue interactions within and between proteins is a key aspect of computational structure prediction including homology modelling, protein-protein docking, refinement of low-resolution structures, and computational protein design.ResultsHere we introduce a method for accurate protein structure modelling and evaluation based on a novel 4-distance description of residue-residue interaction geometry. Statistical 4-distance preferences were extracted from high-resolution protein structures and were used as a basis for a knowledge-based potential, called Hunter. We demonstrate that 4-distance description of side chain interactions can be used reliably to discriminate the native structure from a set of decoys. Hunter ranked the native structure as the top one in 217 out of 220 high-resolution decoy sets, in 25 out of 28 "Decoys 'R' Us" decoy sets and in 24 out of 27 high-resolution CASP7/8 decoy sets. The same concept was applied to side chain modelling in protein structures. On a set of very high-resolution protein structures the average RMSD was 1.47 Å for all residues and 0.73 Å for buried residues, which is in the range of attainable accuracy for a model. Finally, we show that Hunter performs as good or better than other top methods in homology modelling based on results from the CASP7 experiment. The supporting web site http://bioinfo.weizmann.ac.il/hunter/ was developed to enable the use of Hunter and for visualization and interactive exploration of 4-distance distributions.ConclusionsOur results suggest that Hunter can be used as a tool for evaluation and for accurate modelling of residue-residue interactions in protein structures. The same methodology is applicable to other areas involving high-resolution modelling of biomolecules.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.