The Associative memory, Water mediated, Structure and Energy Model (AWSEM) is a coarse-grained protein force field. AWSEM contains physically motivated terms, such as hydrogen bonding, as well as a bioinformatically based local structure biasing term, which efficiently takes into account many-body effects that are modulated by the local sequence. When combined with appropriate local or global alignments to choose memories, AWSEM can be used to perform de novo protein structure prediction. Herein we present structure prediction results for a particular choice of local sequence alignment method based on short residue sequences called fragments. We demonstrate the model’s structure prediction capabilities for three levels of global homology between the target sequence and those proteins used for local structure biasing, all of which assume that the structure of the target sequence is not known. When there are no homologs in the database of structures used for local structure biasing, AWSEM calculations produce structural predictions that are somewhat improved compared with prior works using related approaches. The inclusion of a small number of structures from homologous sequences improves structure prediction only marginally. When the fragment search is restricted to only homologous sequences, however, AWSEM can perform high-resolution structure prediction and can be used for kinetics and dynamics studies.
The protein frustratometer is an energy landscape theory-inspired algorithm that aims at localizing and quantifying the energetic frustration present in protein molecules. Frustration is a useful concept for analyzing proteins’ biological behavior. It compares the energy distributions of the native state with respect to structural decoys. The network of minimally frustrated interactions encompasses the folding core of the molecule. Sites of high local frustration often correlate with functional regions such as binding sites and regions involved in allosteric transitions. We present here an upgraded version of a webserver that measures local frustration. The new implementation, which allows the inclusion of electrostatic energy terms that are important to the interactions with nucleic acids, is significantly faster than the previous version, enabling the analysis of large macromolecular complexes within a user-friendly interface. The webserver is freely available at http://frustratometer.qb.fcen.uba.ar.
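The comparison between the native state and structural decoys can be illustrated with a minimal sketch. It assumes the frustration index is computed as a Z-score of the native energy against the decoy energy distribution; the function names and the cutoffs of 0.78 and -1 are assumptions taken from commonly quoted conventions in the frustration literature, not from the abstract above, and the webserver's actual implementation may differ.

```python
import statistics


def frustration_index(native_energy, decoy_energies):
    """Z-score of the native energy relative to the decoy distribution.

    Positive values mean the native interaction is more stabilizing
    (lower in energy) than a typical decoy.
    """
    mean_decoy = statistics.fmean(decoy_energies)
    std_decoy = statistics.pstdev(decoy_energies)
    if std_decoy == 0:
        raise ValueError("decoy energies must not all be identical")
    return (mean_decoy - native_energy) / std_decoy


def classify_frustration(index, minimal_cutoff=0.78, high_cutoff=-1.0):
    """Label a contact using assumed, commonly quoted cutoffs."""
    if index > minimal_cutoff:
        return "minimally frustrated"
    if index < high_cutoff:
        return "highly frustrated"
    return "neutral"


# Hypothetical energies in arbitrary units, for illustration only.
decoys = [0.0, 2.0]
label = classify_frustration(frustration_index(-1.0, decoys))
```

Here a native energy well below the decoy mean gives a positive index, so the contact is labeled minimally frustrated; a native energy above the decoy mean would push the index negative.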
The energy landscape used by nature over evolutionary timescales to select protein sequences is essentially the same as the one that folds these sequences into functioning proteins, sometimes in microseconds. We show that genomic data, physical coarse-grained free energy functions, and family-specific information theoretic models can be combined to give consistent estimates of energy landscape characteristics of natural proteins. One such characteristic is the effective temperature T_sel at which these foldable sequences have been selected in sequence space by evolution. T_sel quantifies the importance of folded-state energetics and structural specificity for molecular evolution. Across all protein families studied, our estimates for T_sel are well below the experimental folding temperatures, indicating that the energy landscapes of natural foldable proteins are strongly funneled toward the native state.
energy landscape theory | information theory | selection temperature | funneled landscapes | elastic effects
The physics and natural history of proteins are inextricably intertwined (1, 2). The cooperative manner in which proteins find their way to a folded structure is the result of proteins having undergone natural selection and not typical of random polymers (3, 4). Likewise, the requirement that most proteins must fold to function is a strong constraint on their phylogeny. The unavoidable random mutation events that proteins have undergone throughout their evolution have provided countless numbers of physicochemical experiments on folding landscapes. Thus, the evolutionary patterns of proteins found through comparative sequence analysis can be used to understand protein structure and energetics. In this paper, we compare the information content in the correlated changes that have occurred in protein sequences of common ancestry with energies from a transferable energy function to quantify the influence of maintaining foldability on molecular evolution.
Funneled Folding Landscapes from Evolution in Sequence Space
The key to our analysis is the principle of minimal frustration (3, 5), which states that, for quick and robust folding, the energy landscape of a protein must be dominated by interactions found in the native conformation. This native conformation is, therefore, separated by an energy gap from other compact structures that otherwise might act as kinetic traps (6, 7). These kinetic traps might appear on the folding landscape during evolution if a random mutation was to stabilize a conformation distinct from the functional one, leading to unviability. In this way, evolution and physical dynamics are coupled. A funneled, minimally frustrated landscape can be achieved if the sequence of the protein evolves to stabilize the native state while not increasing the landscape ruggedness.
If folding were the only physicochemical constraint on evolution, the ensemble of naturally observed sequences would correspond to the set of sequences that has a solvent-averaged free energy for the native conformation below a ...
The Nuclear Pore Complex (NPC, ~50 MDa) is the sole passageway for the transport of macromolecules across the nuclear envelope. The NPC plays a key role in numerous critical cellular processes such as transcription, and many of its components are implicated in human diseases such as cancer. Previous work (refs 1, 2) defined the relative positions of its 456 constituent proteins (nucleoporins, or Nups), based on spatial restraints derived from biophysical, electron microscopy, and proteomic data. Further elucidation of the evolutionary origin, transport mechanism, and assembly of the NPC will require higher resolution information. As part of an effort to improve upon the resolution and accuracy of the NPC structure, we set out to determine the atomic structures of the NPC components. Because it proved difficult to determine the atomic structures of whole Nups by X-ray crystallography alone, we are relying on multiple datasets that are combined computationally by our Integrative Modeling Platform (IMP) package (http://salilab.org/imp). In particular, we developed an integrative modeling approach that benefits from crystallographic structures of fragments of the protein or its homologs, solution small angle X-ray scattering (SAXS) profiles of the protein and its fragments (ref 3), NMR, and negative stain electron microscopy (EM) micrographs of the protein. Each dataset is converted into a set of spatial restraints on the protein structure, and a model that satisfies the restraints as well as possible is then found using a Monte Carlo / molecular dynamics optimization procedure. The approach will be illustrated by its application to yeast Nup133.
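The idea of converting data into spatial restraints and then searching for a model that satisfies them can be sketched in a toy form. The example below is an assumption-laden illustration, not the IMP API: it uses hypothetical harmonic distance restraints between 2D beads and a plain Metropolis Monte Carlo search (no molecular dynamics), with all names, parameters, and coordinates invented for the sketch.

```python
import math
import random


def restraint_score(coords, restraints):
    """Sum of squared violations of harmonic distance restraints.

    Each restraint is a tuple (i, j, target_distance).
    """
    total = 0.0
    for i, j, target in restraints:
        distance = math.dist(coords[i], coords[j])
        total += (distance - target) ** 2
    return total


def monte_carlo_optimize(coords, restraints, steps=2000,
                         step_size=0.5, temperature=0.1, seed=0):
    """Metropolis search for coordinates that best satisfy the restraints."""
    rng = random.Random(seed)
    current = [tuple(p) for p in coords]
    current_score = restraint_score(current, restraints)
    best, best_score = list(current), current_score
    for _ in range(steps):
        # Perturb one randomly chosen bead.
        k = rng.randrange(len(current))
        trial = list(current)
        trial[k] = tuple(x + rng.uniform(-step_size, step_size)
                         for x in current[k])
        trial_score = restraint_score(trial, restraints)
        delta = trial_score - current_score
        # Accept downhill moves always, uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = trial, trial_score
            if current_score < best_score:
                best, best_score = list(current), current_score
    return best, best_score


# Hypothetical starting model and restraints (a 3-4-5 triangle).
start = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
restraints = [(0, 1, 3.0), (1, 2, 4.0), (0, 2, 5.0)]
model, score = monte_carlo_optimize(start, restraints)
```

In a real integrative workflow each data type (crystal structures, SAXS, EM) would contribute its own scoring term; the single harmonic term here stands in for all of them.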
We investigate protein-protein association using the associative-memory, water-mediated, structure, and energy model (AWSEM), a coarse-grained protein folding model that has been optimized using energy-landscape theory. The potential was originally parameterized by enforcing a funneled nature for a database of dimeric interfaces but was later further optimized to create funneled folding landscapes for individual monomeric proteins. The ability of the model to predict interfaces was not tested previously. The present results show that simulated annealing of the model indeed is able to predict successfully the native interfaces of eight homodimers and four heterodimers, thus amounting to a flexible docking algorithm. We go on to address the relative importance of monomer geometry, flexibility, and nonnative intermonomeric contacts in the association process for the homodimers. Monomer surface geometry is found to be important in determining the binding interface, but it is insufficient. Using a uniform binding potential rather than the water-mediated potential results in sampling of misbound structures that are geometrically preferred but are nonetheless energetically disfavored by AWSEM, as well as in nature. Depending on the stability of the unbound monomers, nonnative contacts play different roles in the association process. For unstable monomers, thermodynamic states stabilized by nonnative interactions correspond to productive, on-pathway intermediates and can, therefore, catalyze binding through a fly-casting mechanism. For stable monomers, in contrast, states stabilized by nonnative interactions generally correspond to traps that impede binding.
binding interface prediction | swapped contacts
Protein-protein interfaces encode information that is key to a molecular understanding of biological functions. The folding of proteins is well understood in the framework of energy landscape theory and its principle of minimal frustration.
Are binding landscapes also funneled? Mechanistic consequences of funneled binding landscapes have been investigated using structure-based models (1-5). The agreement of these mechanisms with observation suggests that binding landscapes are generally funneled, explaining why topology is indeed a major factor in determining binding mechanisms (1). A statistical analysis of a large database of protein complexes revealed that for many of the complexes, the binding energy gap is indeed larger than expected knowing the variance of the binding energy (6), the hallmark feature of a funneled landscape (7). When further testing this idea, Papoian et al. discovered that for other complexes, to have a funneled landscape for binding, unanticipated water-mediated interactions were required. They developed a water-mediated potential encoding these interactions (8). This transferable potential was later optimized to create funneled folding landscapes that successfully predict the structure of monomeric proteins (9, 10). Therefore, there is considerable support for the idea that, like folding landsc...
Frustration from strong interdomain interactions can make misfolding a more severe problem in multidomain proteins than in single-domain proteins. On the basis of bioinformatic surveys, it has been suggested that lowering the sequence identity between neighboring domains is one of nature's solutions to the multidomain misfolding problem. We investigate folding of multidomain proteins using the associative-memory, water-mediated, structure and energy model (AWSEM), a predictive coarse-grained protein force field. We find that reducing sequence identity not only decreases the formation of domain-swapped contacts but also decreases the formation of strong self-recognition contacts between β-strands with high hydrophobic content. The ensembles of misfolded structures that result from forming these amyloid-like interactions are energetically disfavored compared with the native state, but entropically favored. Therefore, these ensembles are more stable than the native ensemble under denaturing conditions, such as high temperature. Domain-swapped contacts compete with self-recognition contacts in forming various trapped states, and point mutations can shift the balance between the two types of interaction. We predict that multidomain proteins that lack these specific strong interdomain interactions should fold reliably.
aggregation | funnel
Protein misfolding and productive protein folding bear a yin-yang relationship in the energy landscape theory of biomolecular self-organization (1). Only by comparing the strengths of the forces leading to proper structure to those that might, by chance, stabilize alternative structure can we quantitatively understand how proteins kinetically access their thermodynamically stable ordered states (1). In vivo and at low concentrations in vitro, unfolded small proteins avoid kinetic traps and generally find their way easily to their native state.
Nevertheless, diseases caused by the misfolding of several specific proteins plague mankind (2, 3). Despite much effort, the patterns of interactions that allow pathological misfolding remain incompletely understood. Known pathological misfolding entails aggregation of specific proteins and thus the interactions of protein molecules with other copies of themselves. Energy landscape theory provides one natural explanation of this specificity in misfolding through the funneled nature of the monomeric protein energy landscape: Native-like interactions between different protein molecules like those found within a single protein are stronger than alternate nonnative interactions in the same molecule or interactions between peptide sequences chosen at random in the two molecules. Because of this intrinsic self-stickiness of foldable molecules, runaway domain swapping, in which native-like interactions are made between different copies of the same protein, provides a natural mechanism for aggregation (4-7). Indeed, transient protein aggregation during refolding at moderately high concentration does appear to be universal (8). Nevertheless this aggregati...
Despite the ubiquity of helical membrane proteins in nature and their pharmacological importance, the mechanisms guiding their folding remain unclear. We performed kinetic folding and unfolding experiments on 69 mutants (engineered every 2-3 residues throughout the 178-residue transmembrane domain) of GlpG, a membrane-embedded rhomboid protease from Escherichia coli. The only clustering of significantly positive ϕ-values occurs at the cytosolic termini of transmembrane helices 1 and 2, which we identify as a compact nucleus. The three loops flanking these helices show a preponderance of negative ϕ-values, which are sometimes taken to be indicative of nonnative interactions in the transition state. Mutations in transmembrane helices 3-6 yielded predominantly ϕ-values near zero, indicating that this part of the protein has denatured-state-level structure in the transition state. We propose that loops 1-3 undergo conformational rearrangements to position the folding nucleus correctly, which then drives folding of the rest of the domain. A compact N-terminal nucleus is consistent with the vectorial nature of cotranslational membrane insertion found in vivo. The origin of the interactions in the transition state that lead to a large number of negative ϕ-values remains to be elucidated.
GlpG | membrane protein | rhomboid | folding | kinetics
The biologically active structure of a protein is encoded in its sequence, and protein-folding studies aim to elucidate how this native state is reached. Great progress has been made in understanding the mechanisms of folding of water-soluble proteins based on comprehensive protein-engineering studies in combination with computational efforts (1, 2) and application of theoretical models (3-5). Much less is known about the folding mechanisms of membrane proteins that present extra challenges such as low expression levels and the need for a membrane-like environment to fold (6-11).
In vivo, α-helical membrane proteins insert into the membrane cotranslationally via the signal recognition particle and Sec-translocon complex (12). Transmembrane helices exit one by one or in pairs into the lipid environment through a lateral gate in the translocon. Folding to the native state occurs spontaneously after helices are inserted into the membrane. To mimic this process, most in vitro membrane protein-folding experiments first denature the protein in SDS; renaturation is then achieved by adding excess nonionic surfactants such as dodecyl maltoside (DDM) (13).
A complete protein folding mechanism must include descriptions of the denatured state (D), the native state (N), any metastable intermediates, and the transiently populated transition states (TS) that connect them. TS can only be analyzed indirectly using methods based on kinetic experiments, such as Fersht's ϕ-value approach (14, 15). The ϕ-value is the ratio between the energy perturbation to TS (from kinetic measurements) and the energy perturbation to N (from equilibrium measurements or a combination of folding and unfolding kinetics) ca...
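As a worked illustration of ϕ-value analysis, the sketch below computes ϕ from wild-type and mutant folding and unfolding rate constants, assuming simple two-state kinetics and the standard definition ϕ = ΔΔG(TS-D) / ΔΔG(N-D). The function name, the default temperature, and the example rate constants are hypothetical and are not taken from the study above.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872  # kcal/(mol*K)


def phi_value(kf_wt, kf_mut, ku_wt, ku_mut, temperature=298.15):
    """Return the folding phi-value for one mutant under two-state kinetics.

    ddG_TS-D = RT ln(kf_wt / kf_mut)            (perturbation to TS)
    ddG_N-D  = ddG_TS-D + RT ln(ku_mut / ku_wt) (perturbation to N)
    Positive ddG values mean the mutation is destabilizing.
    """
    rt = GAS_CONSTANT_KCAL * temperature
    ddg_ts = rt * math.log(kf_wt / kf_mut)
    ddg_native = ddg_ts + rt * math.log(ku_mut / ku_wt)
    if ddg_native == 0:
        raise ValueError("mutation does not change native-state stability")
    return ddg_ts / ddg_native


# Hypothetical rate constants (s^-1), for illustration only.
phi_native_like = phi_value(kf_wt=10.0, kf_mut=1.0, ku_wt=1.0, ku_mut=1.0)
phi_denatured_like = phi_value(kf_wt=1.0, kf_mut=1.0, ku_wt=1.0, ku_mut=10.0)
phi_negative = phi_value(kf_wt=1.0, kf_mut=2.0, ku_wt=1.0, ku_mut=8.0)
```

A ϕ near 1 indicates native-like structure at the mutated site in the transition state, a ϕ near 0 indicates denatured-like structure, and a negative ϕ arises when a mutation that destabilizes N nevertheless speeds up folding.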
While being long in range and therefore weakly specific, electrostatic interactions are able to modulate the stability and folding landscapes of some proteins. The relevance of electrostatic forces for steering the docking of proteins to each other is widely acknowledged; however, the role of electrostatics in establishing specifically funneled landscapes and their relevance for protein structure prediction are still not clear. By introducing Debye-Hückel potentials that mimic long-range electrostatic forces into the Associative memory, Water mediated, Structure, and Energy Model (AWSEM), a transferable protein model capable of predicting tertiary structures, we assess the effects of electrostatics on the landscapes of thirteen monomeric proteins and four dimers. For the monomers, we find that adding electrostatic interactions does not improve structure prediction. Simulations of ribosomal protein S6 show, however, that folding stability depends monotonically on electrostatic strength. The trend in predicted melting temperatures of the S6 variants agrees with experimental observations. Electrostatic effects can play a range of roles in binding. The binding of the protein complex KIX-pKID is largely assisted by electrostatic interactions, which provide direct charge-charge stabilization of the native state and contribute to the funneling of the binding landscape. In contrast, for several other proteins, including the DNA-binding protein FIS, electrostatics causes frustration in the DNA-binding region, which favors its binding with DNA but not with its protein partner. This study highlights the importance of long-range electrostatics in functional responses to problems where proteins interact with their charged partners, such as DNA, RNA, and membranes.
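A Debye-Hückel potential of the kind mentioned above is a Coulomb interaction damped by an exponential screening factor. The following sketch shows the general functional form only; the function name, the dielectric constant of 80, and the 10 Å screening length are illustrative assumptions and should not be read as the parameters used in the AWSEM study.

```python
import math

# Coulomb constant in kcal*Angstrom/(mol*e^2), a standard conversion factor.
COULOMB_KCAL_ANGSTROM = 332.0637


def debye_huckel_energy(q_i, q_j, r, dielectric=80.0, screening_length=10.0):
    """Screened Coulomb energy (kcal/mol) between two point charges.

    q_i, q_j: charges in units of the elementary charge.
    r: separation in Angstrom (must be positive).
    dielectric, screening_length: assumed illustrative values.
    """
    if r <= 0:
        raise ValueError("separation must be positive")
    coulomb = COULOMB_KCAL_ANGSTROM * q_i * q_j / (dielectric * r)
    return coulomb * math.exp(-r / screening_length)


# Opposite charges attract, like charges repel, and both decay with distance.
attractive = debye_huckel_energy(+1.0, -1.0, 5.0)
repulsive = debye_huckel_energy(+1.0, +1.0, 5.0)
far_apart = debye_huckel_energy(+1.0, -1.0, 30.0)
```

The exponential factor makes the interaction fall off much faster than bare Coulomb at separations beyond the screening length, which mimics the effect of mobile ions in solution.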