Understanding the conformational characteristics of protein complexes in solution is crucial for deeper insight into their biological function. Molecular dynamics simulations performed on high-performance computing facilities with modern simulation techniques can be used to obtain large data sets that contain conformational and thermodynamic information about biomolecular systems. While this can in principle give a detailed picture of protein–protein interactions in solution and therefore complement experimental data, it also raises the challenge of processing exceedingly large, high-dimensional data sets with several million samples. Here we present a novel method for the characterization of protein–protein interactions. It combines a neural-network-based dimensionality reduction technique, which yields a two-dimensional representation of the conformational space, with a density-based clustering algorithm for state detection and a metric that assesses the (dis)similarity between different conformational spaces. This method is highly scalable and therefore makes the analysis of massive data sets computationally tractable. We demonstrate the power of this approach to large-scale data analysis by characterizing highly dynamic conformational phase spaces of differently linked ubiquitin (Ub) oligomers from coarse-grained simulations. We are able to extract a protein–protein interaction model for two unlinked Ub proteins, which is then used to determine how the Ub–Ub interaction pattern is altered in Ub oligomers by the introduction of a covalent linkage. We find that the Ub chain conformational ensemble depends strongly on the linkage type and, in some cases, also on the Ub chain length. In this way, we obtain insight into the conformational characteristics of different Ub chains and how these may contribute to linkage-type- and chain-length-specific recognition.
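The abstract above does not say which (dis)similarity metric is used to compare conformational spaces. As a rough illustration of the idea only, the sketch below compares two sets of two-dimensional projections by binning them on a shared grid and summing the per-cell minimum of the normalized occupancies (histogram overlap). The metric choice, the function names `occupancy` and `histogram_overlap`, the `bin_width` parameter, and the toy coordinates are all assumptions made for this example and are not taken from the paper.

```python
from collections import Counter
from math import floor


def occupancy(points, bin_width):
    """Normalized 2D histogram: fraction of samples falling in each grid cell."""
    counts = Counter(
        (floor(x / bin_width), floor(y / bin_width)) for x, y in points
    )
    total = len(points)
    return {cell: n / total for cell, n in counts.items()}


def histogram_overlap(points_a, points_b, bin_width=1.0):
    """Overlap in [0, 1]: 1 for identical occupancies, 0 for disjoint ones.

    This is a stand-in metric chosen for illustration, not the paper's.
    """
    p = occupancy(points_a, bin_width)
    q = occupancy(points_b, bin_width)
    # Only cells populated in both ensembles contribute to the overlap.
    return sum(min(p[cell], q[cell]) for cell in p.keys() & q.keys())


# Hypothetical 2D projections of two conformational ensembles.
ensemble_a = [(0.2, 0.3), (0.4, 0.1), (2.5, 2.5), (2.6, 2.4)]
ensemble_b = [(0.1, 0.2), (2.7, 2.2), (5.5, 5.5), (5.6, 5.4)]

overlap_self = histogram_overlap(ensemble_a, ensemble_a)
overlap_ab = histogram_overlap(ensemble_a, ensemble_b)
```

Because each sample is visited once and only populated cells are stored, a measure of this kind scales linearly with the number of samples, which is the property that matters for data sets with millions of frames.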
The structural basis for the stability of the trimeric form of the light harvesting complex (LHCII), a pigmented protein from green plants pivotal for photosynthesis, remains elusive to date. The protein embedded in a dipalmitoylphosphatidylcholine (DPPC) lipid membrane is investigated using all-atom molecular dynamics simulations to identify the interactions responsible for the structural integrity of the trimer and its relation to antenna function. Central association of chlorophyll a (CLA) molecules near the LHCII chains is attributed to a conserved coordination between the Mg of CLA and the oxygen of a specific residue of the first helix of a chain. In the trimer, but not in the monomer, this residue forms a salt bridge with the fourth helix of the same chain. In an earlier experiment, three residues (WYR) in each chain of the trimer were found to be indispensable for trimerization and are referred to as the trimerization motif. We find that the residues of the trimerization motif are connected to the lipids or pigments by a chain of interactions rather than by direct contact. Synergistic effects of sequentially located hydrogen bonds and salt bridges within monomers of the trimer keep the trimer conformation stable in association with the pigments or the lipids. These interactions are present exclusively in the pigmented trimer and are absent in the monomer and in the unpigmented trimer. Thus, our results provide a molecular basis for the inherent stability of the LHCII trimer in a lipid membrane and explain much of the existing experimental data.
Characterizing the structural dynamics of proteins with heterogeneous conformational landscapes is crucial to understanding complex biomolecular processes. To this end, dimensionality reduction algorithms are used to produce low-dimensional embeddings of the high-dimensional conformational phase space. However, identifying a compact and informative set of input features for the embedding remains an ongoing challenge. Here, we propose to harness the power of Residue Interaction Networks (RINs) and their centrality measures, established tools that provide a graph-theoretical view of molecular structure. Specifically, we combine the closeness centrality, which captures global features of the protein conformation at residue-wise resolution, with EncoderMap, a hybrid neural-network autoencoder/multidimensional-scaling-like dimensionality reduction algorithm. We find that the resulting low-dimensional embedding is a meaningful visualization of the residue interaction landscape that resolves structural details of the protein behavior while retaining global interpretability. This feature-based graph embedding of temporal protein graphs makes it possible to apply the general descriptive power of RIN formalisms to the analysis of protein simulations of complex processes, such as protein folding and multidomain interactions, without requiring protein-specific input. We demonstrate this on simulations of the fast-folding protein Trp-Cage and the multidomain signaling protein FAT10. Due to its generality and modularity, the presented approach can easily be transferred to other protein systems.
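To make the closeness-centrality feature concrete, the following is a minimal sketch that builds an unweighted residue contact graph from one set of coordinates and computes the closeness of each residue by breadth-first search. The distance cutoff, the use of one representative point per residue, the function names `contact_graph` and `closeness_centrality`, and the toy coordinates are assumptions made for this example; the abstract does not state how the RINs are constructed. Applied frame by frame, such a vector of per-residue values could serve as the input feature set for a dimensionality reduction step.

```python
from collections import deque
from math import dist


def contact_graph(coords, cutoff):
    """Link residues i and j when their representative points lie within cutoff."""
    n = len(coords)
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist(coords[i], coords[j]) <= cutoff:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return neighbors


def closeness_centrality(neighbors):
    """Closeness per node: (reachable nodes) / (sum of shortest-path lengths)."""
    result = {}
    for source in neighbors:
        # Breadth-first search yields shortest path lengths in an unweighted graph.
        depth = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nxt in neighbors[node]:
                if nxt not in depth:
                    depth[nxt] = depth[node] + 1
                    queue.append(nxt)
        total = sum(depth.values())
        # Isolated residues have no reachable neighbors and get closeness 0.
        result[source] = (len(depth) - 1) / total if total else 0.0
    return result


# Toy "protein": five residues on a line, one length unit apart (hypothetical).
toy_coords = [(float(i), 0.0, 0.0) for i in range(5)]
toy_graph = contact_graph(toy_coords, cutoff=1.5)
toy_closeness = closeness_centrality(toy_graph)
```

In this toy chain the central residue has the highest closeness and the two termini the lowest, which illustrates how a single per-residue number reflects the global arrangement of the structure.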