Ensemble-based models of protein structure and dynamics reflecting experimental parameters are increasingly used to obtain a deeper understanding of the role of dynamics in protein function. Such ensembles differ substantially from those routinely deposited in the PDB and, consequently, require specialized validation and analysis methodology. Here we describe our completely rewritten online validation tool, CoNSEnsX, which offers a standardized way to assess the correspondence of such ensembles to experimental NMR parameters. The server provides a novel selection feature that lets the user choose which parameters are considered and with what weights. This also offers an estimate of potential overfitting, namely, whether the number of conformers necessary to reflect the experimental parameters can be reduced in the ensemble provided. The CoNSEnsX webserver is available at consensx.itk.ppke.hu. The corresponding Python source code is freely available on GitHub (github.com/PPKE-Bioinf/consensx.itk.ppke.hu).
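The idea of weighted, selection-based assessment can be illustrated with a minimal sketch. This is not the CoNSEnsX implementation: the function names, the use of RMSD as the agreement measure, and the greedy search are all assumptions made for illustration. Each parameter type carries a user-chosen weight, back-calculated values are averaged over the selected conformers, and conformers are added only while the weighted score improves. If the search stops at fewer conformers than the ensemble contains, the ensemble may be larger than the data require.

```python
import math


def ensemble_average(per_conformer, members):
    """Average back-calculated values over the chosen conformers."""
    n_obs = len(per_conformer[0])
    return [sum(per_conformer[m][i] for m in members) / len(members)
            for i in range(n_obs)]


def rmsd(calculated, experimental):
    """Root-mean-square deviation between two equally long value lists."""
    squared = sum((c - e) ** 2 for c, e in zip(calculated, experimental))
    return math.sqrt(squared / len(experimental))


def weighted_score(data, members):
    """Weighted sum of per-parameter RMSDs; lower means better agreement.

    data maps a parameter name to (weight, per_conformer_values, experimental).
    """
    return sum(weight * rmsd(ensemble_average(values, members), experimental)
               for weight, values, experimental in data.values())


def greedy_select(data, n_conformers):
    """Hypothetical greedy search: add the best conformer until no gain."""
    selected, best = [], float("inf")
    while len(selected) < n_conformers:
        candidates = [(weighted_score(data, selected + [c]), c)
                      for c in range(n_conformers) if c not in selected]
        score, conformer = min(candidates)
        if score >= best:
            break  # adding any further conformer does not improve the fit
        selected.append(conformer)
        best = score
    return selected, best


# Toy data (invented numbers): one parameter type, two observations,
# three conformers.
toy = {"coupling": (1.0, [[0.0, 2.0], [2.0, 2.0], [5.0, 5.0]], [1.0, 2.0])}
selected, score = greedy_select(toy, n_conformers=3)
```

In this toy case the search keeps two of the three conformers, because the third only worsens the agreement with the invented experimental values.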
Charged single alpha-helices (CSAHs) constitute a rare structural motif. A CSAH is characterized by a high density of regularly alternating residues with positively and negatively charged side chains. Such segments exhibit unique structural properties; however, there are only a handful of proteins in which their existence has been experimentally verified. Therefore, establishing a pipeline that is capable of predicting the presence of CSAH segments with a low false positive rate is of considerable importance. Here we describe a consensus-based approach that relies on two conceptually different CSAH detection methods and a final filter based on the estimated helix-forming capabilities of the segments. This pipeline was shown to be capable of identifying previously uncharacterized CSAH segments that could be verified experimentally. The method is available as a web server at http://csahserver.itk.ppke.hu and also as a downloadable standalone program suitable for scanning larger sequence collections.
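The consensus-plus-filter logic can be sketched as follows. The two detectors here are deliberately simple stand-ins, not the detection methods used by the server, and the helix-breaker check is only a crude proxy for an estimate of helix-forming capability. All names, window sizes, and thresholds are assumptions chosen for illustration.

```python
POSITIVE = set("KR")
NEGATIVE = set("DE")
HELIX_BREAKERS = set("PG")  # assumption: crude proxy for poor helix propensity


def covered_positions(seq, window, accept):
    """Return all positions covered by windows that the detector accepts."""
    hits = set()
    for start in range(len(seq) - window + 1):
        if accept(seq[start:start + window]):
            hits.update(range(start, start + window))
    return hits


def dense_in_charge(text, min_fraction=0.7):
    """Toy detector A: most residues in the window are charged."""
    charged = sum(r in POSITIVE or r in NEGATIVE for r in text)
    return charged / len(text) >= min_fraction


def balanced_in_charge(text, max_net=2, min_each=3):
    """Toy detector B: both charge signs are present and roughly cancel."""
    pos = sum(r in POSITIVE for r in text)
    neg = sum(r in NEGATIVE for r in text)
    return pos >= min_each and neg >= min_each and abs(pos - neg) <= max_net


def to_segments(positions):
    """Merge a set of positions into half-open (start, end) segments."""
    segments = []
    for p in sorted(positions):
        if segments and segments[-1][1] == p:
            segments[-1][1] = p + 1
        else:
            segments.append([p, p + 1])
    return [tuple(s) for s in segments]


def predict_csah(seq, window=12):
    """Keep positions both detectors agree on, then apply the final filter."""
    consensus = (covered_positions(seq, window, dense_in_charge)
                 & covered_positions(seq, window, balanced_in_charge))
    return [(s, e) for s, e in to_segments(consensus)
            if e - s >= window and not HELIX_BREAKERS & set(seq[s:e])]


# Invented test sequence: a block of alternating E/K runs with neutral flanks.
example = "MSTL" + "EEEEKKKK" * 3 + "LSTA"
segments = predict_csah(example)
```

Requiring agreement between detectors trades sensitivity for a lower false positive rate, which is the stated goal of the consensus approach; the final filter then rejects charged segments that are unlikely to be helical.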
Single alpha-helices (SAHs) are increasingly recognized as important structural and functional elements of proteins. Comprehensive identification of SAH segments in large protein datasets was largely hindered by the slow speed, on common hardware, of the most restrictive prediction tool for their identification, FT_CHARGE. We have previously implemented an FPGA-based version of this tool, allowing fast analysis of a large number of sequences. Using this implementation, we have set up a semi-automated pipeline capable of analyzing full UniProt releases in reasonable time and compiling monthly updates of a comprehensive database of SAH segments. Releases of this database, denoted CSAHDB, are available on the CSAHserver 2 website at csahserver.itk.ppke.hu. An overview of human SAH-containing sequences combined with a literature survey suggests specific roles of SAH segments in proteins involved in RNA-based regulation processes as well as in cytoskeletal proteins, a number of which are also linked to the development and function of synapses.
PDZ domains are abundant interaction hubs found in a number of different proteins, and they exhibit characteristic differences in their structure and ligand specificity. Their internal dynamics have been proposed to contribute to their biological activity via changes in conformational entropy upon ligand binding and allosteric modulation. Here we investigate dynamic structural ensembles of PDZ3 of the postsynaptic protein PSD-95, calculated based on previously published backbone and side-chain S² order parameters. We show that there are distinct but interdependent structural rearrangements in PDZ3 upon ligand binding and in the presence of the intramolecular allosteric modulator helix α3. We have also compared these rearrangements with those in PDZ1-2 of PSD-95 and with the conformational diversity of an extended set of PDZ domains available in the PDB. We conclude that although the opening-closing rearrangement occurring upon ligand binding is likely a general feature of all PDZ domains, the conformer redistribution upon ligand binding along this mode is domain-dependent. Our findings suggest that the structural and functional diversity of PDZ domains is accompanied by a diversity of internal motional modes and their interdependence.
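For readers unfamiliar with order parameters, the following sketch shows the standard way an S² value can be back-calculated for one bond vector from a superimposed, equally weighted ensemble: S² = 3/2 Σ⟨uᵢuⱼ⟩² − 1/2, where the sum runs over the Cartesian components of the normalized bond vector. It is only an illustration of the quantity itself, not of how the ensembles in this study were generated, and the function name is hypothetical.

```python
import math


def order_parameter(bond_vectors):
    """Back-calculate S^2 for one bond from its orientation in each conformer.

    bond_vectors holds one (x, y, z) vector per conformer, for example an
    N-H bond vector, taken after superimposing the conformers.
    """
    units = []
    for vector in bond_vectors:
        norm = math.sqrt(sum(c * c for c in vector))
        units.append([c / norm for c in vector])

    n = len(units)
    total = 0.0
    for i in range(3):
        for j in range(3):
            mean = sum(u[i] * u[j] for u in units) / n  # <u_i u_j>
            total += mean * mean
    return 1.5 * total - 0.5


rigid = order_parameter([(0.0, 0.0, 2.0)] * 4)  # same orientation everywhere
isotropic = order_parameter([(1, 0, 0), (0, 1, 0), (0, 0, 1)])
```

A bond that points the same way in every conformer gives S² = 1, while orientations spread evenly over the three axes give S² = 0, which spans the range from fully rigid to fully disordered.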
Ensemble-based structural modeling of flexible protein segments such as intrinsically disordered regions is a complex task often solved by selection of conformers from an initial pool based on their conformity to experimental data. However, the properties of the conformational pool are crucial, as the sampling of the conformational space should be sufficient and, in the optimal case, relatively uniform. In other words, the ideal sampling is both efficient and exhaustive. To achieve this, specialized tools are usually necessary, which might not be maintained in the long term, available on all platforms or flexible enough to be tweaked to individual needs. Here, we present an open-source and extendable pipeline to generate initial protein structure pools for use with selection-based tools to obtain ensemble models of flexible protein segments. Our method is implemented in Python and uses ChimeraX, Scwrl4, Gromacs and neighbor-dependent backbone distributions compiled and published previously by the Dunbrack lab. All these tools and data are publicly available and maintained. Our basic premise is that by using residue-specific, neighbor-dependent Ramachandran distributions, we can enhance the efficient exploration of the relevant region of the conformational space. We have also provided a straightforward way to bias the sampling towards specific conformations for selected residues by combining different conformational distributions. This allows the consideration of a priori known conformational preferences such as in the case of preformed structural elements. The open-source and modular nature of the pipeline allows easy adaptation for specific problems. We tested the pipeline on an intrinsically disordered segment of the protein Cd3ϵ and also a single alpha-helical (SAH) region by generating conformational pools and selecting ensembles matching experimental data using the CoNSEnsX+ server.
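The sampling and biasing ideas can be sketched in a few lines. This is not the published pipeline: the binned distributions below are invented toy numbers rather than the Dunbrack lab data, the linear mixture is only one assumed way of combining distributions, and all function names are hypothetical. Each residue gets its own distribution over (phi, psi) bins, and a residue can be biased towards a known conformation by mixing in a second distribution.

```python
import random

# Toy (phi, psi) bin probabilities in degrees; invented for illustration.
COIL = {(-60, -45): 0.4, (-120, 130): 0.4, (60, 45): 0.2}
HELIX = {(-60, -45): 1.0}


def mix(base, bias, weight):
    """Combine two binned distributions as a normalized linear mixture."""
    bins = set(base) | set(bias)
    mixed = {b: (1 - weight) * base.get(b, 0.0) + weight * bias.get(b, 0.0)
             for b in bins}
    total = sum(mixed.values())
    return {b: p / total for b, p in mixed.items()}


def sample_backbone(distributions, rng):
    """Draw one (phi, psi) pair per residue from its own distribution."""
    angles = []
    for dist in distributions:
        bins = sorted(dist)
        weights = [dist[b] for b in bins]
        angles.append(rng.choices(bins, weights=weights, k=1)[0])
    return angles


def generate_pool(distributions, n_conformers, seed=0):
    """Build a pool of backbone angle sets for later structure building."""
    rng = random.Random(seed)
    return [sample_backbone(distributions, rng) for _ in range(n_conformers)]


# Four residues: two unbiased, one half-biased and one fully biased
# towards the helical bin.
per_residue = [COIL, COIL, mix(COIL, HELIX, 0.5), mix(COIL, HELIX, 1.0)]
pool = generate_pool(per_residue, n_conformers=50)
```

In a full workflow the sampled angles would then be turned into three-dimensional models and refined with structure-building and simulation tools; that step is outside the scope of this sketch.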