Intrinsically disordered proteins (IDPs) are important in a broad range of biological functions and are involved in many diseases. An understanding of intrinsic disorder is key to developing drugs against IDPs. Experimental characterization of IDPs is expensive and inefficient, which calls for the development of computational tools. Here, we present ADOPT, a new predictor of protein disorder. ADOPT is a deep bidirectional transformer, which extracts dense residue-level representations from Facebook’s Evolutionary Scale Modeling (ESM) library. Using the experimentally designed CheZod database as a training and test dataset for protein disorder, it predicts Z scores and protein disorder with new state-of-the-art performance in a few seconds. We show that ADOPT offers substantial improvement over previous predictors, with a Spearman correlation coefficient of 0.69 between experimental and computational Z scores. We identify the coordinates that are relevant for the prediction performance and show that good performance can already be gained with fewer than 100 features. We believe that ADOPT will be a useful tool for all experimental scientists working with intrinsically disordered proteins. It is available as a standalone package at https://github.com/PeptoneInc/ADOPT.git.
Conformationally controlled flexible molecules are ideal for applications in medicine and materials, where shape matters but an ability to adapt to multiple and changing environments is often required. The conformation of flexible hydrocarbon chains bearing contiguous methyl substituents is controlled through the avoidance of syn-pentane interactions: alternating syn–anti isomers adopt a linear conformation while all-syn isomers adopt a helical conformation. From a simple diamond lattice analysis, larger substituents, which would be required for most potential applications, result in significant and unavoidable syn-pentane interactions, suggesting substantially reduced conformational control. Through a combination of computation, synthesis, and NMR analysis, we have identified a selection of substitution patterns that allow large groups to be incorporated on conformationally controlled linear and helical hydrocarbon chains. Surprisingly, when the methyl substituents of alternating syn–anti hydrocarbons are replaced with acetoxyethyl groups, the main chain adopts a linear conformation in almost 95% of the population of molecules. Here, the side chains adopt nonideal eclipsed conformations with the main chain, thus minimizing syn-pentane interactions. In the case of all-syn hydrocarbons, concurrent removal of some methyl groups on the main chain adjacent to the large substituents is required to maintain a high population of molecules adopting a helical conformation. This information can now be used to design flexible hydrocarbon chains displaying functional groups in a defined relative orientation for multivalent binding or cooperative reactivity, for example, in targeting the interfaces defined by disease-relevant protein–protein interactions.
Intrinsically disordered proteins (IDPs) are important for a broad range of biological functions and are involved in many diseases. An understanding of intrinsic disorder is key to developing compounds that target IDPs. Experimental characterization of IDPs is hindered by the very fact that they are highly dynamic. Computational methods that predict disorder from the amino acid sequence have been proposed. Here, we present ADOPT (Attention DisOrder PredicTor), a new predictor of protein disorder. ADOPT is composed of a self-supervised encoder and a supervised disorder predictor. The former is based on a deep bidirectional transformer, which extracts dense residue-level representations from Facebook’s Evolutionary Scale Modeling library. The latter uses a database of nuclear magnetic resonance chemical shifts, constructed to ensure balanced amounts of disordered and ordered residues, as a training and a test dataset for protein disorder. ADOPT predicts whether a protein or a specific region is disordered with better performance than the best existing predictors and faster than most other proposed methods (a few seconds per sequence). We identify the features that are relevant for the prediction performance and show that good performance can already be gained with <100 features. ADOPT is available as a stand-alone package at https://github.com/PeptoneLtd/ADOPT and as a web server at https://adopt.peptone.io/.
Motivation: Accurate modelling of protein ensembles requires sampling of a large number of 3D conformations. A number of sampling approaches that use internal coordinates have been proposed, yet poor performance in the conversion from internal to Cartesian coordinates limits their applicability.
Results: We describe here NeRFax, an efficient method for the conversion from internal to Cartesian coordinates that utilizes the platform-agnostic JAX Python library. The relative benefit of NeRFax is demonstrated here on peptide chain reconstruction tasks. Our novel approach offers 35-175x performance gains compared to previous state-of-the-art methods, whereas a >10,000x speedup is reported in a reconstruction of a biomolecular condensate of 1,000 chains.
Availability: NeRFax has purely open-source dependencies and is available at https://github.com/PeptoneInc/nerfax.
Contact: oliver@peptone.io
Supplementary information: Supplementary data are available at Bioinformatics online.