Intrinsically disordered proteins (IDPs) lack well-defined three-dimensional structures, thus challenging the archetypal notion of structure-function relationships. Determining the ensemble of conformations that IDPs explore under physiological conditions is the first step toward understanding their diverse cellular functions. Here, we quantitatively characterize the structural features of IDPs as a function of sequence and length using coarse-grained simulations. For diverse IDP sequences, with the number of residues ($N_T$) ranging from 20 to 441, our simulations not only reproduce the radii of gyration ($R_g$) obtained from experiments, but also predict the full scattering intensity profiles in excellent agreement with small-angle X-ray scattering experiments. The $R_g$ values are well-described by the standard Flory scaling law, $R_g = R_g^0 N_T^{\nu}$ with $\nu \approx 0.588$, making it tempting to assert that IDPs behave as polymers in a good solvent. However, clustering analysis reveals that the menagerie of structures explored by IDPs is diverse, with the extent of heterogeneity being highly sequence-dependent, even though ensemble-averaged properties, such as the dependence of $R_g$ on chain length, may suggest synthetic polymer-like behavior in a good solvent. For example, we show that for the highly charged Prothymosin-α, a substantial fraction of conformations is highly compact. Even if the sequence compositions are similar, as is the case for α-Synuclein and a truncated construct from the Tau protein, there are substantial differences in the conformational heterogeneity. Taken together, these observations imply that metrics based on net charge or related quantities alone cannot be used to anticipate the phases of IDPs, either in isolation or in complex with partner IDPs or RNA.
Our work sets the stage for probing the interactions of IDPs with each other, with folded protein domains, or with partner RNAs, which are critical for describing the structures of stress granules and biomolecular condensates with important cellular functions.
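As a rough illustration of how a Flory exponent can be extracted from radii of gyration, the sketch below fits the scaling law on a log-log scale by ordinary least squares. The function name `fit_flory`, the chain lengths, and the prefactor of 0.2 nm are assumptions made for this example; the data are synthetic and noise-free, generated from the scaling law itself, and are not values from the paper.

```python
import math

def fit_flory(n_residues, rg_values):
    """Fit log(Rg) = log(Rg0) + nu * log(N) by ordinary least squares.

    Returns (rg0, nu). Hypothetical helper, not the authors' code.
    """
    xs = [math.log(n) for n in n_residues]
    ys = [math.log(r) for r in rg_values]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    nu = sxy / sxx                       # slope of the log-log line
    rg0 = math.exp(y_mean - nu * x_mean)  # intercept gives the prefactor
    return rg0, nu

# Synthetic, noise-free data (illustrative only): Rg0 = 0.2 nm is assumed.
lengths = [20, 50, 100, 200, 441]
rg_synthetic = [0.2 * n ** 0.588 for n in lengths]

rg0_fit, nu_fit = fit_flory(lengths, rg_synthetic)
print(f"Rg0 = {rg0_fit:.3f} nm, nu = {nu_fit:.3f}")
```

A fit of this kind uses only ensemble-averaged values, which is why, as the abstract argues, a good-solvent exponent by itself says little about the heterogeneity of the underlying conformations.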
Intrinsically disordered proteins (IDPs) lack well-defined three-dimensional structures, thus challenging the archetypal notion of structure-function relationships. Determining the ensemble of conformations that IDPs explore under physiological conditions is the first step towards understanding their diverse cellular functions. Here, we quantitatively characterize the structural features of IDPs as a function of sequence and length using coarse-grained simulations. For diverse IDP sequences, with the number of residues ($N_T$) ranging from 24 to 441, our simulations not only reproduce the radii of gyration ($R_g$) obtained from experiments, but also predict the full scattering intensity profiles in very good agreement with small-angle X-ray scattering experiments. The $R_g$ values are well-described by the standard Flory scaling law, $R_g = R_g^0 N_T^{\nu}$ with $\nu \approx 0.588$, making it tempting to assert that IDPs behave as polymers in a good solvent. However, clustering analysis reveals that the menagerie of structures explored by IDPs is diverse, with the extent of heterogeneity being highly sequence-dependent, even though ensemble-averaged properties, such as the dependence of $R_g$ on chain length, may suggest synthetic polymer-like behavior in a good solvent. For example, we show that for the highly charged Prothymosin-α, a substantial fraction of conformations is highly compact. Even if the sequence compositions are similar, as is the case for α-Synuclein and a truncated construct from the Tau protein, there are substantial differences in the conformational heterogeneity. Taken together, these observations imply that metrics based on net charge or related quantities alone cannot be used to anticipate the phases of IDPs, either in isolation or in complex with partner IDPs or RNA.
Our work sets the stage for probing the interactions of IDPs with each other, with folded protein domains, or with partner RNAs, which are critical for describing the structures of stress granules and biomolecular condensates with important cellular functions.
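To make the comparison with scattering experiments concrete, the following sketch computes a radius of gyration and a normalized scattering profile from a single bead conformation using the Debye formula, I(q)/I(0) = (1/N^2) Σ sin(q r_ij)/(q r_ij). It assumes identical point-like beads of equal weight, with no residue form factors and no solvent contribution, so it is a simplification rather than the procedure used in the paper. The function names and the Gaussian random-walk "chain" are placeholders invented for this example.

```python
import math
import random

def radius_of_gyration(coords):
    """Equal-weight radius of gyration of a list of (x, y, z) beads."""
    n = len(coords)
    center = [sum(c[k] for c in coords) / n for k in range(3)]
    sq = sum(sum((c[k] - center[k]) ** 2 for k in range(3)) for c in coords)
    return math.sqrt(sq / n)

def debye_intensity(coords, q):
    """Normalized Debye intensity I(q)/I(0) for identical point scatterers."""
    n = len(coords)
    total = 0.0
    for a in coords:
        for b in coords:
            x = q * math.dist(a, b)
            # sin(x)/x tends to 1 as x -> 0 (self terms and q = 0)
            total += 1.0 if x == 0.0 else math.sin(x) / x
    return total / n ** 2

# Toy conformation: a 50-bead Gaussian random walk (not a real IDP model).
random.seed(0)
chain = [(0.0, 0.0, 0.0)]
for _ in range(49):
    last = chain[-1]
    chain.append(tuple(last[k] + random.gauss(0.0, 1.0) for k in range(3)))

rg = radius_of_gyration(chain)
print(f"Rg = {rg:.3f} (arbitrary length units)")
for q in (0.0, 0.05, 0.2, 0.5):
    print(f"q = {q:.2f}  I(q)/I(0) = {debye_intensity(chain, q):.4f}")
```

For an ensemble, the profile would be averaged over many conformations. At small q the result follows the Guinier form 1 - q^2 Rg^2 / 3, which is how Rg is commonly read off from measured profiles.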
Many biological functions are executed by molecular machines, which, like man-made motors, consume energy and convert it into mechanical work. Biological machines have evolved to transport cargo, facilitate folding of proteins and RNA, remodel chromatin, and replicate DNA. A common aspect of these machines is that their functions are driven by fuel provided by hydrolysis of ATP or GTP, thus driving them out of equilibrium. It is a challenge to provide a general framework for understanding the functions of biological machines, such as molecular motors (kinesin, dynein, and myosin), molecular chaperones, and helicases. Using these machines, whose structures have little resemblance to one another, as prototypical examples, we describe a few general theoretical methods that have provided insights into their functions. Although the theories rely on coarse-graining of these complex systems, they have proven useful not only in accounting for many in vitro experiments but also in addressing questions such as how the trade-offs between precision, energetic cost, and optimal performance are balanced. However, many complexities associated with biological machines will require one to go beyond current theoretical methods. We point out that simple point mutations in the enzyme could drastically alter function, making the motors bidirectional, resulting in unexpected diseases, or dramatically restricting the capacity of molecular chaperones to help proteins fold. These examples are reminders that while the search for principles of generality in biology is intellectually stimulating, one also ought to keep in mind that molecular details must be accounted for to develop a deeper understanding of processes driven by biological machines. Going beyond generic descriptions of in vitro behavior to a genuine understanding of in vivo functions will likely remain a major challenge for some time to come.
In this context, the combination of careful experiments and the use of physics and physical chemistry principles will be useful in elucidating the rules governing the workings of biological machines.
Residues spanning distinct regions of the low-complexity domain of the RNA-binding protein, Fused in Sarcoma (FUS-LC), form fibril structures with different core morphologies. Solid-state NMR experiments show that the 214-residue FUS-LC forms a fibril with an S-bend (core-1, residues 39–95), while the rest of the protein is disordered. In contrast, the fibrils of the C-terminal variant (FUS-LC-C; residues 111–214) have a U-bend topology (core-2, residues 112–150). Absence of the U-bend in FUS-LC implies that the two fibril cores do not coexist. Computer simulations show that these perplexing findings can be understood in terms of the populations of fibril-like excited states, which are sparsely populated in the monomer. The propensity to form core-1 is higher than the propensity to form core-2. We predict that core-2 forms only in truncated variants that do not contain the core-1 sequence. At the monomer level, sequence-dependent enthalpic effects determine the relative stabilities of the core-1 and core-2 topologies.
A technology for optimization of potential parameters from condensed-phase simulations (POP) is discussed and illustrated. It is based on direct calculations of the derivatives of macroscopic observables with respect to the potential parameters. The derivatives are used in a local minimization scheme that compares simulated and experimental data. In particular, we show that the Newton trust region protocol allows for more accurate and robust optimization. We apply the newly developed technology to study the liquid mixture of tert-butanol and water. We are able to obtain, after four iterations, the correct phase behavior and accurately predict the values of the Kirkwood-Buff (KB) integrals. We further illustrate that a potential that is determined solely by KB information, or the pair correlation function, is not necessarily unique.
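The idea of using observable derivatives inside a Newton trust-region minimization can be sketched for a single scalar parameter. In the example below, the "simulation" is replaced by an analytic stand-in: the callables `observable`, `d_obs`, and `d2_obs` are hypothetical and return an observable and its first and second derivatives with respect to the parameter. The objective is the squared deviation from a target value. This is a toy under those assumptions, not the POP implementation; in one dimension, clipping the Newton step to the trust radius solves the trust-region subproblem exactly.

```python
def trust_region_newton_1d(observable, d_obs, d2_obs, target, p0,
                           radius=1.0, max_iter=50, tol=1e-10):
    """Minimize f(p) = 0.5 * (observable(p) - target)**2 for scalar p."""
    def f(p):
        return 0.5 * (observable(p) - target) ** 2

    p = p0
    for _ in range(max_iter):
        resid = observable(p) - target
        grad = resid * d_obs(p)                   # df/dp
        hess = d_obs(p) ** 2 + resid * d2_obs(p)  # d2f/dp2
        if abs(grad) < tol:
            break
        # Exact minimizer of the quadratic model on [-radius, radius].
        if hess > 0:
            step = max(-radius, min(radius, -grad / hess))
        else:
            step = -radius if grad > 0 else radius
        predicted = -(grad * step + 0.5 * hess * step ** 2)
        actual = f(p) - f(p + step)
        rho = actual / predicted if predicted > 0 else 0.0
        # Grow the region after a good boundary step, shrink after a poor one.
        if rho > 0.75 and abs(abs(step) - radius) < 1e-12:
            radius *= 2.0
        elif rho < 0.25:
            radius *= 0.25
        if rho > 0.1:  # accept only steps that reduce f sufficiently
            p += step
    return p

# Toy stand-in for a simulated observable: A(p) = p**2, target value 2.0.
p_opt = trust_region_newton_1d(
    observable=lambda p: p ** 2,
    d_obs=lambda p: 2.0 * p,
    d2_obs=lambda p: 2.0,
    target=2.0,
    p0=3.0,
)
print(f"optimized parameter: {p_opt:.6f}")
```

In a real application each evaluation of the observable and its derivatives would require a condensed-phase simulation, so the number of iterations matters far more than in this analytic toy. With several parameters, the subproblem no longer reduces to clipping and needs a proper multidimensional solver.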