Motivation: Identifying the enzymatic or pharmacological activities of proteins is an important area of research in biology and chemistry. Biological and chemical databases are increasingly being populated with linkages between protein sequences and chemical structures. There is now sufficient information to apply machine-learning techniques to predict interactions between chemicals and proteins at a genome scale. Current machine-learning techniques use as input either protein sequences and structures or chemical information. We propose here a method to infer protein-chemical interactions using heterogeneous input consisting of both protein sequence and chemical information. Results: Our method relies on expressing proteins and chemicals with a common cheminformatics representation. We demonstrate our approach by predicting whether proteins can catalyze reactions not present in training sets. We also predict whether a given drug can bind a target, in the absence of prior binding information for that drug and target. Such predictions cannot be made with current machine-learning techniques, which require binding information for individual reactions or individual targets. Availability and Contact: For questions or paper reprints, please contact Jean-Loup Faulon at jfaulon@sandia.gov. Additional information on the signature molecular descriptor and codes can be downloaded at:
Current SDSL-EPR methods allow measurement of dipolar distances in the 8-70 Å range; however, the use of extrinsic probes complicates the interpretation of these distances in modeling macromolecular structure and conformational changes. The data presented here show that interprobe distances correlate only weakly with Cβ-Cβ distances, especially for distances that are on the order of the spin label tether lengths. Explicitly incorporating the spin label into the modeling process increases the experiment/model correlation 4-fold and reduces the distance error from 6 Å to 3 Å.
Electron paramagnetic resonance (EPR) is often used in the study of the orientation and dynamics of proteins. However, there are two major obstacles in the interpretation of EPR signals: (a) most spin labels are not fully immobilized by the protein, hence it is difficult to distinguish the mobility of the label with respect to the protein from the reorientation of the protein itself; (b) even in cases where the label is fully immobilized its orientation with respect to the protein is not known, which prevents interpretation of probe reorientation in terms of protein reorientation. We have developed a computational strategy for determining whether or not a spin label is immobilized and, if immobilized, predicting its conformation within the protein. The method uses a Monte Carlo minimization algorithm to search the conformational space of labels within known atomic level structures of proteins. To validate the method a series of spin labels of varying size and geometry were docked to sites on the myosin head catalytic and regulatory domains. The predicted immobilization and conformation compared well with
Site-directed spin labeling EPR (SDSL-EPR ... troponin I | spin labels | Fourier transform electron paramagnetic resonance | DEER | dipolar Regulation of striated muscle contraction is associated with Ca2+-dependent structural transitions in the muscle thin filament, which is composed of the troponin complex, tropomyosin, and actin. Troponin is composed of three components: TnC, which binds Ca2+; TnI, which inhibits actomyosin activity; and TnT, which anchors TnC and TnI to tropomyosin. Muscle contraction is initiated by the binding of Ca2+ to the N-lobe regulatory sites of TnC. The N lobe then undergoes a structural transition from a closed to an open form, which then facilitates the release of the inhibitory region of TnI from actin, binding to TnC and stimulation of the actomyosin ATPase. The mechanism of this signaling pathway is still tentative (for review, see ref. 1). Crystal and NMR structures are available for TnC and several complexes of TnC with TnI fragments. The crystal structure of TnC reveals a dumbbell-shaped protein consisting of two globular domains, joined by a 22-residue central α-helix (2-4). Each domain contains a hydrophobic cleft and two helix-loop-helix EF-hand metal binding motifs; two high-affinity Ca2+/Mg2+ sites in the C-terminal domain (sites III and IV) and two low-affinity Ca2+-specific sites in the N-terminal domain (sites I and II). In skeletal TnC, sites III and IV are permanently occupied by Mg2+ and facilitate the structural binding of TnC to the contractile apparatus (5). Binding of Ca2+ to sites I and II is the physiological trigger for muscle contraction. In cardiac TnC, binding site I is inactive and Ca2+ binding to site II does not induce as large a structural change as observed in skeletal TnC (6). TnI is capable of inhibiting actomyosin ATPase in the absence of other subunits, but Ca2+-dependent regulation requires TnC, TnT, and tropomyosin.
The inhibitory region (skTnI 96-117) alone can fully inhibit actomyosin ATPase activity (7), possibly by binding either to actin or to TnC in the "on" or "off" states (8-10). The corresponding residues for the cardiac inhibitory region are 129-150 because of a unique ≈32 residue N-terminal extension of cTnI. Structural information for the intact troponin complex is limited to low-resolution neutron diffraction and electron microscopy studies (11-13), though several high-resolution structures of TnC with bound TnI peptides are available. Two computational models have been recently proposed for the binary complex of TnC and TnI (14,15). Both models have TnI and TnC in an antiparallel arrangement with multiple interaction sites between the two subunits. NMR, crystallography, and neutron scattering were used by Tung et al. (15) to develop a computational model of the binary complex in which TnI winds around TnC in either a left-handed manner (model "L") or a right-handed manner (model "R"). In both structures, the inhibitory region of TnI is modeled as a flexible β-hairpin in close proximity to the central helix of TnC. ...
Herein we present a computational technique for generating helix-membrane protein folds matching a predefined set of distance constraints, such as those obtained from NMR NOE, chemical cross-linking, dipolar EPR, and FRET experiments. The purpose of the technique is to provide initial structures for local conformational searches based on either energetic considerations or ad-hoc scoring criteria. In order to properly screen the conformational space, the technique generates an exhaustive list of conformations within a specified root-mean-square deviation (RMSD) where the helices are positioned in order to match the provided distances. Our results indicate that the number of structures decreases exponentially as the number of distances increases, and increases exponentially as the errors associated with the distances increase. We also found the number of solutions to be smaller when all the distances share one helix in common, compared to the case where the distances connect helices in a daisy-chain manner. We found that for 7 helices, at least 15 distances with errors up to 8 Å are needed to produce a number of solutions that is not too large to be processed by local search refinement procedures. Finally, without energetic considerations, our enumeration technique retrieved the transmembrane domains of Bacteriorhodopsin (PDB entry 1c3w), Halorhodopsin (1e12), Rhodopsin (1f88), Aquaporin-1 (1fqy), Glycerol uptake facilitator protein (1fx8), Sensory Rhodopsin (1jgj), and a subunit of Fumarate reductase flavoprotein (1qlaC) with Cα level RMSDs of 3.0 Å, 2.3 Å, 3.2 Å, 4.6 Å, 6.0 Å, 3.7 Å, and 4.4 Å, respectively.
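The two quantities this abstract relies on, agreement with a set of distance constraints within a stated error and RMSD between two structures, can be illustrated with a minimal Python sketch. It is not the authors' enumeration algorithm: the function names, the point-per-helix representation, and the example coordinates are all assumptions made for illustration, and the RMSD is computed without any superposition step.

```python
import math

def satisfies_constraints(coords, constraints, tolerance):
    """Return True if every constrained pair lies within `tolerance` of its target.

    coords:      dict mapping a label to an (x, y, z) tuple in Å (hypothetical layout)
    constraints: list of (label_a, label_b, target_distance) tuples
    tolerance:   allowed absolute error on each distance, in Å
    """
    return all(
        abs(math.dist(coords[a], coords[b]) - target) <= tolerance
        for a, b, target in constraints
    )

def rmsd(points_a, points_b):
    """RMSD between two equally ordered point lists, without superposition."""
    if len(points_a) != len(points_b) or not points_a:
        raise ValueError("point lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(points_a, points_b))
    return math.sqrt(squared / len(points_a))

# Illustrative values only: three made-up reference points and two constraints.
coords = {"A": (0.0, 0.0, 0.0), "B": (10.0, 0.0, 0.0), "C": (0.0, 10.0, 0.0)}
constraints = [("A", "B", 12.0), ("A", "C", 9.0)]
loose = satisfies_constraints(coords, constraints, tolerance=2.0)
tight = satisfies_constraints(coords, constraints, tolerance=1.0)
```

With these made-up values the candidate passes at a 2 Å tolerance and fails at 1 Å, which is the sense in which larger distance errors admit more candidate structures.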
We present a two-step approach to modeling the transmembrane spanning helical bundles of integral membrane proteins using only sparse distance constraints, such as those derived from chemical crosslinking, dipolar EPR and FRET experiments. In Step 1, using an algorithm we developed, the conformational space of membrane protein folds matching a set of distance constraints is explored to provide initial structures for local conformational searches. In Step 2, these structures are refined against a custom penalty function that incorporates both measures derived from statistical analysis of solved membrane protein structures and distance constraints obtained from experiments. We begin by describing the statistical analysis of the solved membrane protein structures from which the theoretical portion of the penalty function was derived. We then describe the penalty function, and, using a set of six test cases, demonstrate that it is capable of distinguishing helical bundles that are close to the native bundle from those that are far from the native bundle. Finally, using a set of only 27 distance constraints extracted from the literature, we show that our method successfully recovers the structure of dark-adapted rhodopsin to within 3.2 Å of the crystal structure. Keywords: helix packing; transmembrane helices; distance constraints; molecular refinement Integral membrane proteins are essential components of the cell membrane that participate in many important cellular processes such as energy transduction, cell signaling, mediation of senses such as vision, cell intoxication, pathogenesis, and immune recognition. Their significance is emphasized by the fact that approximately one-third of the proteins encoded by a typical genome are membrane proteins (Buchan et al. 2002). Furthermore, at least 70% of current pharmaceuticals are thought to act on membrane proteins (Wilson and Bergsma 2000).
Despite their obvious importance, to date, the structures of fewer than 75 integral membrane proteins have been solved (see White 2003 and references therein), and this number includes redundant structures across species. This is in stark contrast to the more than 25,000 soluble proteins whose structures have been solved using X-ray crystallography and NMR. Reasons for the slow progress in the structural analysis of membrane proteins include the instability of membrane proteins in environments lacking phospholipids, their tendency to aggregate and precipitate, and protein abundance, expression, and purification issues. These characteristics highlight why the application of standard structure determination methods to membrane proteins is nontrivial. Given the nature of the difficulties in generating high-resolution structural data from methods such as X-ray crystallography and NMR, it is unlikely that these experimental techniques will yield a significant increase in the number of solved membrane protein structures in the near future. As an alternative approach, the focus here is on modeling transmembrane proteins using a set of s...
The relative movement of the catalytic and regulatory domains of the myosin head (S1) is likely to be the force generating conformational change in the energy transduction of muscle [Rayment, I., Holden, H. M., Whittaker, M., Yohn, C. B., Lorenz, M., Holmes, K. C., and Milligan, R. A. (1993) Science 261, 58-65]. To test this model we have measured, using frequency-modulated FRET, three distances between the catalytic domain and regulatory domains and within the regulatory domain of myosin. The donor/acceptor pairs included MHC cys707 and ELC cys177; ELC cys177 and RLC cys154; and ELC cys177 and gizzard RLC cys108. The IAEDANS (donor) or acceptor (DABMI or IAF) labeled light chains (ELC and RLC) were exchanged into monomeric myosin and the distances were measured in the putative prepower stroke states (in the presence of MgATP or ADP/AlF(4-)) and the postpower stroke states (ADP and the absence of nucleotides). For each of the three distances, the donor/acceptor pairs were reversed to minimize uncertainty in the distance measured, arising from probe orientational factors. The distances obtained from FRET were in close agreement with the distances in the crystal structure. Importantly, none of the measured distances varied by more than 2 Å, putting a strong constraint on the extent of conformational changes within S1. The maximum axial movement of the distal part of myosin head was modeled using FRET distance changes within the myosin head reported here and previously. These models revealed an upper bound of 85 Å for a swing of the regulatory domain with respect to the catalytic domain during the power stroke. Additionally, an upper bound of 22 Å could be contributed to the power stroke by a reorientation of RLC with respect to the ELC during the power stroke.
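For readers unfamiliar with how a FRET measurement becomes a distance, the standard Förster relations can be sketched in Python: the transfer efficiency is E = 1 - τ_DA/τ_D from donor lifetimes, and E = 1 / (1 + (r/R0)^6) gives the donor-acceptor distance r. The abstract does not give lifetimes or Förster radii, so the function names and every numeric value below are illustrative assumptions, not data from the study.

```python
def efficiency_from_lifetimes(tau_da, tau_d):
    """Transfer efficiency E = 1 - tau_DA / tau_D.

    tau_da: donor lifetime in the presence of the acceptor
    tau_d:  donor lifetime in the absence of the acceptor (same units)
    """
    if tau_d <= 0 or not 0 <= tau_da <= tau_d:
        raise ValueError("expected 0 <= tau_da <= tau_d and tau_d > 0")
    return 1.0 - tau_da / tau_d

def distance_from_efficiency(efficiency, r0):
    """Invert E = 1 / (1 + (r / R0)**6) to get r = R0 * ((1 - E) / E)**(1/6)."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0 * ((1.0 - efficiency) / efficiency) ** (1.0 / 6.0)

# Placeholder numbers chosen for illustration; not taken from the paper.
R0_ANGSTROM = 40.0  # hypothetical Förster radius for a donor/acceptor pair
e = efficiency_from_lifetimes(tau_da=5.0, tau_d=10.0)
r = distance_from_efficiency(e, R0_ANGSTROM)
```

Because the distance depends on the sixth root of (1 - E)/E, the estimate is most sensitive when r is close to R0, which is one reason the choice of donor/acceptor pair matters for a given distance range.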
Reorientation of the regulatory domain of the myosin head is a feature of all current models of force generation in muscle. We have determined the orientation of the myosin regulatory light chain (RLC) using a spin-label bound rigidly and stereospecifically to the single Cys-154 of a mutant skeletal isoform. Labeled RLC was reconstituted into skeletal muscle fibers using a modified method that results in near-stoichiometric levels of RLC and fully functional muscle. Complex electron paramagnetic resonance spectra obtained in rigor necessitated the development of a novel decomposition technique. The strength of this method is that no specific model for a complex orientational distribution was presumed. The global analysis of a series of spectra, from fibers tilted with respect to the magnetic field, revealed two populations: one well-ordered (±15°) with the spin-label z axis parallel to actin, and a second population with a large distribution (±60°). A lack of order in relaxed or nonoverlap fibers demonstrated that regulatory domain ordering was defined by interaction with actin rather than the thick filament surface. No order was observed in the regulatory domain during isometric contraction, consistent with the substantial reorientation that occurs during force generation. For the first time, spin-label orientation has been interpreted in terms of the orientation of a labeled domain. A Monte Carlo conformational search technique was used to determine the orientation of the spin-label with respect to the protein. This in turn allows determination of the absolute orientation of the regulatory domain with respect to the actin axis. The comparison with the electron microscopy reconstructions verified the accuracy of the method; the axial orientation determined by electron paramagnetic resonance was within 10° of the electron microscopy model.