Structure determination of linear peptides of 5–50 amino acids, both in aqueous solution and interacting with proteins, is a key aspect of structural biology. PEP-FOLD3 is a novel computational framework that allows both (i) de novo free or biased prediction for linear peptides between 5 and 50 amino acids, and (ii) the generation of native-like conformations of peptides interacting with a protein when the interaction site is known in advance. PEP-FOLD3 is fast and usually returns solutions in a few minutes. Testing PEP-FOLD3 on 56 peptides in aqueous solution led to experimental-like conformations for 80% of the targets. On a benchmark of 61 peptide–protein targets, starting from the unbound form of the protein receptor, PEP-FOLD3 generated peptide poses deviating on average by 3.3 Å from the experimental conformation and returned a native-like pose in the first 10 clusters for 52% of the targets. PEP-FOLD3 is available at http://bioserv.rpbs.univ-paris-diderot.fr/services/PEP-FOLD3.
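The 3.3 Å figure above is a deviation between a predicted and an experimental peptide pose. As a minimal sketch of how such a deviation can be computed, the function below returns the root-mean-square deviation between two equally sized coordinate sets. It assumes both poses are already expressed in the same reference frame (for instance after superimposing the receptors) and performs no fitting itself. The function name and the choice of atoms are illustrative assumptions, not the protocol used by the PEP-FOLD3 authors.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) tuples.

    Assumption: the two coordinate sets are matched atom by atom and
    already share a reference frame; no superposition is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Hypothetical two-atom example: the second pose is shifted by 3 Å along z.
predicted = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
experimental = [(0.0, 0.0, 3.0), (1.0, 0.0, 3.0)]
deviation = rmsd(predicted, experimental)
```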
Hydrophobic cluster analysis (HCA) is an original approach for protein sequence analysis, which provides access to the foldable repertoire of the protein universe, including yet unannotated protein segments ("dark proteome"). Foldable segments correspond to ordered regions, as well as to intrinsically disordered regions (IDRs) undergoing disorder-to-order transitions. This review illustrates how HCA can give insight into this last category of foldable segments, with examples matching known 3D structures. After reviewing the HCA principles, examples of short foldable segments are given, which often contain short linear motifs, typically matching hydrophobic clusters. These segments become ordered upon contact with partners, with secondary structure preferences generally corresponding to those observed in the 3D structures within the complexes. Such small foldable segments are sometimes larger than the segments of known 3D structures, including flanking hydrophobic clusters that may be critical for interaction specificity or regulation, as well as intervening sequences allowing fuzziness. Cases of larger conditionally disordered domains are also presented; these have either a lower density of hydrophobic clusters than well-folded globular domains or exposed hydrophobic patches, and are stabilized by interaction with partners.
We present a new approach for the prediction of the coarse-grain 3D structure of RNA molecules. We model a molecule as being made of helices and junctions. Those junctions are classified into topological families that determine their preferred 3D shapes. All the parts of the molecule are then allowed to establish long-distance contacts that induce a 3D folding of the molecule. An algorithm relying on game theory is proposed to discover such long-distance contacts that allow the molecule to reach a Nash equilibrium. As reported by our experiments, this approach allows one to predict the global shape of large molecules of several hundreds of nucleotides that are out of reach of the state-of-the-art methods.
Hydrophobic clusters, as defined by Hydrophobic Cluster Analysis (HCA), are conditioned binary patterns, made of hydrophobic and non-hydrophobic positions, whose limits fit well those of regular secondary structures (RSS). They have proved useful for predicting secondary structures in proteins from a single amino acid sequence alone, and have made it possible to assess, in a comprehensive way, the leading role of binary patterns in the preference of RSS for a particular state. Here, we considered the available experimental 3D structures of protein globular domains to enlarge our previously reported hydrophobic cluster database (HCDB), almost doubling the number of hydrophobic cluster species (each species being defined by a unique binary pattern) that represent the most frequent structural bricks encountered within protein globular domains. We then used this updated HCDB to show that the hydrophobic amino acids of discordant clusters, i.e. those less abundant clusters for which the observed secondary structure disagrees with the binary pattern preference of the species to which they belong, are more exposed to solvent and more involved in protein interfaces than the hydrophobic amino acids of concordant clusters. As amino acid composition differs between concordant and discordant clusters, considering binary patterns may provide novel insights into key features of protein globular domain cores and surfaces. It can also provide useful information on possible conformational plasticity, including disorder-to-order transitions.
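To make the notion of a binary pattern concrete, here is a minimal sketch that encodes a sequence as 1 (hydrophobic) and 0 (other) and splits it into clusters. It rests on simplifying assumptions that are not stated in the abstract: the hydrophobic alphabet is taken as V, I, L, F, M, Y, W; a run of at least four non-hydrophobic residues, or any proline, ends a cluster; and the delineation is done on the linear sequence rather than on the 2D helical net used by the full HCA method. Function names are illustrative.

```python
HYDROPHOBIC = set("VILFMYW")  # assumed hydrophobic alphabet

def binary_pattern(seq):
    """Encode a sequence as '1' (hydrophobic) / '0' (non-hydrophobic)."""
    return "".join("1" if aa in HYDROPHOBIC else "0" for aa in seq.upper())

def hydrophobic_clusters(seq, min_gap=4):
    """Return (start, end, pattern) tuples, with end exclusive.

    A cluster starts and ends on a hydrophobic residue. It is closed by a
    proline or by min_gap or more consecutive non-hydrophobic residues
    (simplified linear rule, assumed here for illustration).
    """
    seq = seq.upper()
    clusters = []
    start = last = None  # first and last hydrophobic index of the open cluster

    def close():
        clusters.append((start, last + 1, binary_pattern(seq[start:last + 1])))

    for i, aa in enumerate(seq):
        if aa == "P":
            if start is not None:
                close()
                start = last = None
        elif aa in HYDROPHOBIC:
            if start is not None and i - last - 1 >= min_gap:
                close()
                start = None
            if start is None:
                start = i
            last = i
    if start is not None:
        close()
    return clusters

# Hypothetical sequence: two clusters separated by five glycines.
example = hydrophobic_clusters("AVLKEELFGGGGGWIA")
```

Each distinct pattern string (for instance `1100011`) would correspond to one cluster species in the sense used above.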
Hidden Markov Model-derived structural alphabets are a probabilistic framework in which the complete conformational space of a peptidic chain is described in terms of probability distributions that can be sampled to identify the most probable conformations. Here, we assess how three strategies to sample sub-optimal conformations (Viterbi k-best, forward backtrack and a taboo sampling approach) can lead to the efficient generation of peptide conformations. We show that the diversity of sampling is essential to compensate for biases introduced in the estimates of the probabilities, and we find that only the forward backtrack and taboo sampling strategies can efficiently generate native or near-native models. Finally, we also find that such approaches are as efficient as former protocols while being one order of magnitude faster, opening the door to the large-scale de novo modeling of peptides and mini-proteins. © 2016 Wiley Periodicals, Inc.
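The forward backtrack strategy can be illustrated on a generic discrete HMM. The sketch below runs a scaled forward pass, then draws a state path backwards, each state being sampled in proportion to its forward probability times the transition into the state already chosen. It is a textbook version under stated assumptions: the toy parameters, the function names and the integer encoding of observations are hypothetical, and nothing here reproduces the structural alphabet or the parameters of the paper.

```python
import random

def forward(init, trans, emit, obs):
    """Scaled forward probabilities alpha[t][s] for a discrete HMM."""
    n = len(init)
    row = [init[s] * emit[s][obs[0]] for s in range(n)]
    total = sum(row)
    alpha = [[x / total for x in row]]
    for o in obs[1:]:
        prev = alpha[-1]
        row = [
            emit[s][o] * sum(prev[r] * trans[r][s] for r in range(n))
            for s in range(n)
        ]
        total = sum(row)
        alpha.append([x / total for x in row])
    return alpha

def sample_path(alpha, trans, rng):
    """Draw one state path from the posterior by stochastic backtracking."""
    states = list(range(len(alpha[0])))
    # Last state: proportional to the final forward probabilities.
    path = [rng.choices(states, weights=alpha[-1])[0]]
    # Earlier states: proportional to alpha[t][s] * trans[s][next_state].
    for t in range(len(alpha) - 2, -1, -1):
        nxt = path[-1]
        weights = [alpha[t][s] * trans[s][nxt] for s in states]
        path.append(rng.choices(states, weights=weights)[0])
    path.reverse()
    return path

# Toy two-state model (hypothetical values).
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]
obs = [0, 1, 1, 0]

rng = random.Random(0)
alpha = forward(init, trans, emit, obs)
samples = [sample_path(alpha, trans, rng) for _ in range(5)]
```

Repeated calls return different paths, which is the source of the sampling diversity discussed above; a Viterbi decoder would instead always return the single most probable path.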
Sir4 is a core component of heterochromatin found in yeasts of the Saccharomycetaceae family, whose general hallmark is to harbor a three-loci mating-type system with two silent loci. However, a large part of the Sir4 amino acid sequences has remained unexplored, belonging to the dark proteome. Here, we analyzed the phylogenetic profile of yet undescribed foldable regions present in Sir4 as well as in Esc1, a Sir4-interacting perinuclear anchoring protein. Within Sir4, we identified a new conserved motif (TOC) adjacent to the N-terminal KU-binding motif. We also found that the Esc1-interacting region of Sir4 is a Dbf4-related H-BRCT domain, only present in species possessing the HO endonuclease and in Kluyveromyces lactis. In addition, we found new motifs within Esc1, including a motif (Esc1-F) that is unique to species where Sir4 possesses an H-BRCT domain. Mutagenesis of conserved amino acids of the Sir4 H-BRCT domain, known to play a critical role in the Dbf4 function, shows that the function of this domain is separable from the essential role of Sir4 in transcriptional silencing and the protection from HO-induced cutting in Saccharomyces cerevisiae. In the more distant methylotrophic clade of yeasts, which often harbor a two-loci mating-type system with one silent locus, we also found a yet undescribed H-BRCT domain in a distinct protein, the ISWI2 chromatin-remodeling factor subunit Itc1. This study provides new insights into yeast heterochromatin evolution and emphasizes the value of sensitive sequence analysis methods for identifying hitherto ignored functional regions within the dark proteome.
The blind identification of candidate patches of interaction on the protein surface is a difficult task that can hardly be accomplished without a heuristic or the use of simplified representations to speed up the search. The PEP-SiteFinder protocol performs a systematic blind search on the protein surface using a rigid docking procedure applied to a limited set of peptide suboptimal conformations expected to approximate satisfactorily the conformation of the peptide in interaction. All steps rely on a coarse-grained representation of the protein and the peptide. While simple, such a protocol can help to infer useful information, assuming a critical analysis of the results. Moreover, such a protocol can be extended to a semi-flexible protocol where the suboptimal conformations are directly folded in the vicinity of the receptor.