Ronneberger, Kathryn Tunyasuvunakool,
This is a PDF file of a peer-reviewed paper that has been accepted for publication. Although unedited, the content has been subjected to preliminary formatting. Nature is providing this early version of the typeset paper as a service to our authors and readers. The text and figures will undergo copyediting and a proof review before the paper is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers apply.
We describe the operation and improvement of AlphaFold, the system that was entered by the team AlphaFold2 to the "human" category in the 14th Critical Assessment of Protein Structure Prediction (CASP14). The AlphaFold system entered in CASP14 is entirely different to the one entered in CASP13. It used a novel end-to-end deep neural network trained to produce protein structures from amino acid sequence, multiple sequence alignments, and homologous proteins. In the assessors' ranking by summed z scores (>2.0), AlphaFold scored 244.0 compared to 90.8 by the next best group. The predictions made by AlphaFold had a median domain GDT_TS of 92.4; this is the first time that this level of average accuracy has been achieved during CASP, especially on the more difficult Free Modeling targets, and represents a significant improvement in the state of the art in protein structure prediction. We report how AlphaFold was run as a human team during CASP14 and improved such that it now achieves an equivalent level of performance without intervention, opening the door to highly accurate large-scale structure prediction.
Analysis of an intrinsically disordered protein (IDP) reveals an underlying multifunnel structure for the energy landscape. We suggest that such ‘intrinsically disordered’ landscapes, with a number of very different competing low-energy structures, are likely to characterise IDPs, and provide a useful way to address their properties. In particular, IDPs are present in many cellular protein interaction networks, and several questions arise regarding how they bind to partners. Are conformations resembling the bound structure selected for binding, or does further folding occur on binding the partner in an induced-fit fashion? We focus on the p53 upregulated modulator of apoptosis (PUMA) protein, which adopts an α-helical conformation when bound to its partner, and is involved in the activation of apoptosis. Recent experimental evidence shows that folding is not necessary for binding, and supports an induced-fit mechanism. Using a variety of computational approaches we deduce the molecular mechanism behind the instability of the PUMA peptide as a helix in isolation. We find significant barriers between partially folded states and the helix. Our results show that the favoured conformations are molten-globule-like, stabilised by charged and hydrophobic contacts, with structures resembling the bound state relatively unpopulated in equilibrium.
We investigate the solvent effects leading to dissociation of sodium chloride in water. Thermodynamic analysis reveals dissociation to be driven energetically and opposed entropically, with the loss in entropy due to an increasing number of solvent molecules entering the highly coordinated ionic solvation shell. We show through committor analysis that the ion–ion distance is an insufficient reaction coordinate, in agreement with previous findings. By application of committor analysis on various constrained solvent ensembles, we find that the dissociation event is generally sensitive to solvent fluctuations at long ranges, with both sterics and electrostatics of importance. The dynamics of the reaction reveal that solvent rearrangements leading to dissociation occur on time scales from 0.5 to 5 ps or longer, and that, near the transition state, inertial effects enhance the reaction probability of a given trajectory.
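The committor used in the analysis above is the probability that a trajectory started from a given configuration reaches the product state before the reactant state. A minimal sketch of how such a probability can be estimated by repeated trial trajectories is given below. It is an illustration only: it uses a symmetric one-dimensional random walk rather than the solvated ion pair, and the function name `estimate_committor` and its parameters are hypothetical.

```python
import random


def estimate_committor(start, n_sites=10, n_trials=2000, seed=0):
    """Estimate the committor of a toy symmetric random walk.

    State A is site 0 and state B is site `n_sites` (both absorbing).
    The committor is the fraction of trial trajectories launched from
    `start` that reach B before A. This toy model is an assumption made
    for illustration; it is not the system studied in the abstract.
    """
    rng = random.Random(seed)
    reached_b = 0
    for _ in range(n_trials):
        pos = start
        # Propagate until the walker is absorbed in A or B.
        while 0 < pos < n_sites:
            pos += 1 if rng.random() < 0.5 else -1
        if pos == n_sites:
            reached_b += 1
    return reached_b / n_trials


if __name__ == "__main__":
    for site in (2, 5, 8):
        print(site, estimate_committor(site))
```

For this walk the exact committor is `start / n_sites`, so the site with committor 0.5 plays the role of a transition state. If a proposed reaction coordinate is a good one, configurations sharing its transition-state value have committors clustered near 0.5; a broad spread of committor values signals that other degrees of freedom matter.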
Free energy perturbation (FEP) was proposed by Zwanzig [J. Chem. Phys. 22, 1420 (1954)] more than six decades ago as a method to estimate free energy differences and has since inspired a huge body of related methods that use it as an integral building block. Being an importance sampling based estimator, however, FEP suffers from a severe limitation: the requirement of sufficient overlap between distributions. One strategy to mitigate this problem, called Targeted FEP, uses a high-dimensional mapping in configuration space to increase the overlap of the underlying distributions. Despite its potential, this method has attracted only limited attention due to the formidable challenge of formulating a tractable mapping. Here, we cast Targeted FEP as a machine learning problem in which the mapping is parameterized as a neural network that is optimized so as to increase the overlap. We develop a new model architecture that respects permutational and periodic symmetries often encountered in atomistic simulations and test our method on a fully periodic solvation system. We demonstrate that our method leads to a substantial variance reduction in free energy estimates when compared against baselines, without requiring any additional data.
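To make the overlap problem concrete, the sketch below compares the standard FEP estimator, ΔF = -kT ln⟨exp(-ΔU/kT)⟩_A, with a targeted variant in which samples are first passed through an invertible map M and the energy difference is corrected by the log-Jacobian of M. It is a toy illustration under stated assumptions: two one-dimensional harmonic wells stand in for the atomistic system, the map is a hand-written rescaling rather than a trained neural network, and all function and variable names are hypothetical.

```python
import numpy as np


def fep_free_energy(work, kT=1.0):
    """Exponential-averaging (Zwanzig) estimate: -kT * ln <exp(-work/kT)>."""
    w = -np.asarray(work, dtype=float) / kT
    w_max = w.max()  # shift by the maximum for numerical stability
    return -kT * (w_max + np.log(np.mean(np.exp(w - w_max))))


def u_a(x):
    """Toy potential of state A (assumed): harmonic well with unit stiffness."""
    return 0.5 * x**2


def u_b(x, k):
    """Toy potential of state B (assumed): harmonic well with stiffness k."""
    return 0.5 * k * x**2


kT = 1.0
k = 4.0
rng = np.random.default_rng(0)

# Boltzmann samples of state A: Gaussian with variance kT.
x = rng.normal(0.0, np.sqrt(kT), size=100_000)

# Plain FEP: the energy difference is evaluated on the unmapped samples.
plain_work = u_b(x, k) - u_a(x)

# Targeted FEP: map samples with M(x) = x / sqrt(k) and subtract
# kT * ln|dM/dx|. For this toy pair the map is exact, so the
# generalized work is the same for every sample.
mapped = x / np.sqrt(k)
log_jacobian = -0.5 * np.log(k)
targeted_work = u_b(mapped, k) - u_a(x) - kT * log_jacobian

exact = 0.5 * kT * np.log(k)  # analytic result for two harmonic wells
df_plain = fep_free_energy(plain_work, kT)
df_targeted = fep_free_energy(targeted_work, kT)

if __name__ == "__main__":
    print("exact   ", exact)
    print("plain   ", df_plain, "work std:", plain_work.std())
    print("targeted", df_targeted, "work std:", targeted_work.std())
```

Because the rescaling transforms the distribution of A exactly into that of B, the targeted work has essentially zero spread and the estimate matches the analytic value from any number of samples. In a realistic system no such closed-form map is available, which is the gap a learned mapping is meant to fill.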
Machine learning techniques are being increasingly used as flexible non-linear fitting and prediction tools in the physical sciences. Fitting functions that exhibit multiple solutions as local minima can be analysed in terms of the corresponding machine learning landscape. Methods to explore and visualise molecular potential energy landscapes can be applied to these machine learning landscapes to gain new insight into the solution space involved in training and the nature of the corresponding predictions. In particular, we can define quantities analogous to molecular structure, thermodynamics, and kinetics, and relate these emergent properties to the structure of the underlying landscape. This Perspective aims to describe these analogies with examples from recent applications, and suggest avenues for new interdisciplinary research.