Peptides mediate up to 40% of protein interactions, and their high specificity and ability to bind at sites that small molecules cannot reach make them potential drug candidates. However, predicting peptide-protein complexes remains more challenging than predicting protein-protein or protein-small molecule interactions, in part because of the high flexibility of peptides. In this review, we look at advances in docking, molecular simulations, and machine learning that tackle peptide-related problems such as predicting structures, binding affinities, or even kinetics. We specifically focus on explaining the many docking programs and force fields used in molecular simulations, so that a prospective user can make an educated choice of modeling tool to address their scientific questions.
Intrinsically disordered regions of proteins often mediate important protein-protein interactions. However, the folding-upon-binding nature of many polypeptide-protein interactions limits the ability of modeling tools to predict structures of such complexes. To address this problem, we have taken a tandem approach combining NMR chemical shift data and molecular simulations to determine structures of peptide-protein complexes. Here, we demonstrate this approach for polypeptide complexes formed with the extraterminal (ET) domain of bromo and extraterminal domain (BET) proteins, which exhibit a high degree of binding plasticity. This system is particularly challenging as the binding process includes allosteric changes across the ET receptor upon binding, and the polypeptide binding partners can form different conformations (e.g., helices and hairpins) in the complex. In a blind study, the new approach successfully modeled bound-state conformations and binding poses, using only backbone chemical shift data, in excellent agreement with experimentally determined structures. The approach also predicts relative binding affinities of different peptides. This hybrid MELD-NMR approach provides a powerful new tool for structural analysis of protein-polypeptide complexes in the low NMR information content regime, which can be used successfully for flexible systems where one polypeptide binding partner folds upon complex formation.
Intrinsically disordered regions of proteins often mediate important protein-protein interactions. However, the folding-upon-binding nature of many polypeptide-protein interactions limits the ability of modeling tools to predict the three-dimensional structures of such complexes. To address this problem, we have taken a tandem approach combining NMR chemical shift data and molecular simulations to determine the structures of peptide-protein complexes. Here, we use the MELD (Modeling Employing Limited Data) technique applied to polypeptide complexes formed with the extraterminal domain (ET) of bromo and extraterminal domain (BET) proteins, which exhibit a high degree of binding plasticity. This system is particularly challenging as the binding process includes allosteric changes across the ET receptor upon binding, and the polypeptide binding partners can adopt different conformations (e.g., helices and hairpins) in the complex. In a blind study, the new approach successfully modeled bound-state conformations and binding poses, using only protein receptor backbone chemical shift data, in excellent agreement with experimentally determined structures for moderately tight (Kd ∼100 nM) binders. The hybrid MELD + NMR approach required additional peptide ligand chemical shift data for weaker (Kd ∼250 μM) peptide binding partners. AlphaFold also successfully predicts the structures of some of these peptide-protein complexes. However, whereas AlphaFold can provide qualitative peptide rankings, MELD can directly estimate relative binding affinities. The hybrid MELD + NMR approach offers a powerful new tool for structural analysis of protein-polypeptide complexes involving disorder-to-order transitions upon complex formation, which are not successfully modeled with most other complex prediction methods, providing both the 3D structures of peptide-protein complexes and their relative binding affinities.
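To put the two dissociation constants quoted above on a common energy scale, the short sketch below converts a Kd into a standard binding free energy using the textbook relation ΔG° = RT ln(Kd / 1 M). This is not the MELD procedure for estimating relative affinities, which the abstract does not describe; the function names, the 298.15 K temperature, and the 1 M standard state are assumptions chosen for illustration.

```python
import math

# Gas constant in kcal/(mol*K); the temperature default below is an assumed
# room-temperature value, not one taken from the abstract.
R_KCAL = 1.987204e-3


def binding_free_energy(kd_molar, temperature=298.15):
    """Standard binding free energy in kcal/mol from Kd (in mol/L, 1 M standard state)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature * math.log(kd_molar)


def relative_affinity(kd_a, kd_b, temperature=298.15):
    """ddG = dG(a) - dG(b); a positive value means a binds more weakly than b."""
    return (binding_free_energy(kd_a, temperature)
            - binding_free_energy(kd_b, temperature))


# The two regimes mentioned in the abstract: ~100 nM and ~250 uM binders.
dg_tight = binding_free_energy(100e-9)
dg_weak = binding_free_energy(250e-6)
ddg = relative_affinity(250e-6, 100e-9)

print(f"dG (100 nM) = {dg_tight:.2f} kcal/mol")
print(f"dG (250 uM) = {dg_weak:.2f} kcal/mol")
print(f"ddG (weak - tight) = {ddg:.2f} kcal/mol")
```

Under these assumptions the two binders differ by roughly 4.6 kcal/mol, which gives a sense of the affinity range the method had to cover.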
Sparsely labeled NMR samples provide opportunities to study larger biomolecular assemblies than is traditionally done by NMR. This requires new computational tools that can handle the sparsity and ambiguity in the NMR datasets. The MELD (Modeling Employing Limited Data) Bayesian approach was assessed as the best-performing method for predicting structures from sparsely labeled NMR data in the 13th edition of the Critical Assessment of Structure Prediction (CASP), although limitations of the methodology were also noted. In this report, we evaluate the nature and difficulty of modeling unassigned sparsely labeled NMR datasets and report on an improved methodological pipeline leading to higher-accuracy predictions. We benchmark our methodology against the NMR datasets provided by CASP13.
Peptides are prevalent in biology, mediating as many as 40% of protein-protein interactions and taking part in other cellular functions such as transport and signaling. Their ability to bind with high specificity makes them promising therapeutic agents with properties intermediate between those of small molecules and large biologics. Beyond their biological role, peptides can be programmed to self-assemble, and they are already being used for functions as diverse as oligonucleotide delivery, tissue regeneration, or as drugs. However, the transient nature of their interactions has limited the number of available structures and the knowledge of binding affinities, and their flexible nature has limited the success of computational pipelines that predict the structures and affinities of these molecules. Fortunately, recent advances in experimental and computational pipelines are creating new opportunities for this field. We are starting to see promising predictions of complex structures and of thermodynamic and kinetic properties. We believe that in the coming years this will lead to robust rational peptide design pipelines with success similar to that of pipelines applied to small-molecule drug discovery.