Unlike random heteropolymers, natural proteins fold into unique ordered structures. Understanding how these are encoded in amino-acid sequences is complicated by energetically unfavourable non-ideal features—for example kinked α-helices, bulged β-strands, strained loops and buried polar groups—that arise in proteins from evolutionary selection for biological function or from neutral drift. Here we describe an approach to designing ideal protein structures stabilized by completely consistent local and non-local interactions. The approach is based on a set of rules relating secondary structure patterns to protein tertiary motifs, which make possible the design of funnel-shaped protein folding energy landscapes leading into the target folded state. Guided by these rules, we designed sequences predicted to fold into ideal protein structures consisting of α-helices, β-strands and minimal loops. Designs for five different topologies were found to be monomeric and very stable and to adopt structures in solution nearly identical to the computational models. These results illuminate how the folding funnels of natural proteins arise and provide the foundation for engineering a new generation of functional proteins free from natural evolution.
Degeneracy in the genetic code, which enables a single protein to be encoded by a multitude of synonymous gene sequences, has an important role in regulating protein expression, but substantial uncertainty exists concerning the details of this phenomenon. Here we analyze the sequence features influencing protein expression levels in 6,348 experiments using bacteriophage T7 polymerase to synthesize messenger RNA in Escherichia coli. Logistic regression yields a new codon-influence metric that correlates only weakly with genomic codon-usage frequency, but strongly with global physiological protein concentrations and also mRNA concentrations and lifetimes in vivo. Overall, the codon content influences protein expression more strongly than mRNA-folding parameters, although the latter dominate in the initial ~16 codons. Genes redesigned based on our analyses are transcribed with unaltered efficiency but translated with higher efficiency in vitro. The less efficiently translated native sequences show greatly reduced mRNA levels in vivo. Our results suggest that codon content modulates a kinetic competition between protein elongation and mRNA degradation that is a central feature of the physiology and also possibly the regulation of translation in E. coli.
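As a rough illustration of how a logistic regression can turn codon content into per-codon influence scores, the sketch below fits a binary model to codon frequencies. It is a minimal sketch and not the model used in the study: the feature choice (per-codon frequency), the function names `codon_frequencies` and `fit_logistic`, the placeholder codons, and the four toy sequences with their made-up outcome labels are all assumptions introduced here for illustration.

```python
import math
from collections import Counter


def codon_frequencies(cds):
    """Return the fraction of each codon in an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    counts = Counter(codons)
    return {codon: n / len(codons) for codon, n in counts.items()}


def fit_logistic(features, labels, codons, lr=0.5, epochs=2000):
    """Fit binary logistic regression by batch gradient descent.

    features: list of {codon: frequency} dicts, one per gene
    labels:   list of 0/1 outcomes (hypothetical: 1 = expressed, 0 = not)
    Returns (weights, bias); each weight plays the role of a per-codon score.
    """
    weights = {c: 0.0 for c in codons}
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = {c: 0.0 for c in codons}
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = bias + sum(weights[c] * x.get(c, 0.0) for c in codons)
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of outcome 1
            err = p - y
            for c in codons:
                grad_w[c] += err * x.get(c, 0.0)
            grad_b += err
        for c in codons:
            weights[c] -= lr * grad_w[c] / n
        bias -= lr * grad_b / n
    return weights, bias


# Synthetic toy data: placeholder codons and made-up labels, not real results.
toy_sequences = ["AAAAAAAAACCC", "AAAAAAAAAAAA", "CCCCCCCCCAAA", "CCCCCCCCCCCC"]
toy_labels = [1, 1, 0, 0]
toy_features = [codon_frequencies(s) for s in toy_sequences]
weights, bias = fit_logistic(toy_features, toy_labels, codons=["AAA", "CCC"])
```

In this toy setup the fitted weight for the codon enriched in the positively labeled sequences comes out higher than the weight for the other codon. A real analysis would use all 61 sense codons, many more genes, and additional covariates such as mRNA-folding parameters.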
In selecting a method to produce a recombinant protein, a researcher is faced with a bewildering array of choices as to where to start. To facilitate decision-making, we describe a consensus 'what to try first' strategy based on our collective analysis of the expression and purification of over 10,000 different proteins. This review presents methods that could be applied at the outset of any project, a prioritized list of alternate strategies and a list of pitfalls that trip many new investigators.
Tryptophan 2,3-dioxygenase (TDO) and indoleamine 2,3-dioxygenase (IDO) constitute an important, yet relatively poorly understood, family of heme-containing enzymes. Here, we report extensive structural and biochemical studies of the Xanthomonas campestris TDO and a related protein SO4414 from Shewanella oneidensis, including the structure at 1.6-Å resolution of the catalytically active, ferrous form of TDO in a binary complex with the substrate L-Trp. The carboxylate and ammonium moieties of tryptophan are recognized by electrostatic and hydrogen-bonding interactions with the enzyme and a propionate group of the heme, thus defining the L-stereospecificity. A second, possibly allosteric, L-Trp-binding site is present at the tetramer interface. The sixth coordination site of the heme-iron is vacant, providing a dioxygen-binding site that would also involve interactions with the ammonium moiety of L-Trp and the amide nitrogen of a glycine residue. The indole ring is positioned correctly for oxygenation at the C2 and C3 atoms. The active site is fully formed only in the binary complex, and biochemical experiments confirm this induced-fit behavior of the enzyme. The active site is completely devoid of water during catalysis, which is supported by our electrochemical studies showing significant stabilization of the enzyme upon substrate binding.

Keywords: cancer | heme enzymes | immunomodulation | indoleamine 2,3-dioxygenase

Tryptophan 2,3-dioxygenase (TDO) and indoleamine 2,3-dioxygenase (IDO) catalyze the oxidative cleavage of the L-tryptophan (L-Trp) pyrrole ring, the first and rate-limiting step in L-Trp catabolism through the kynurenine pathway (1-3).
In addition, IDO has been implicated in a diverse range of physiological and pathological conditions, including suppression of T cell proliferation, maternal tolerance to allogenic fetus, and immune escape of cancers (4-8), and is an attractive target for drug discovery against cancer and autoimmune and other diseases (2, 9-12).

Despite catalyzing identical biochemical reactions (Fig. 1a), the sequence similarity between TDO and IDO is extremely low. An alignment of their sequences is only possible based on their structures, which suggests a sequence identity of 10% between them (Fig. 1b). In comparison, Xanthomonas campestris TDO shares 34% sequence identity with human TDO (Fig. 1b), demonstrating the remarkable evolutionary conservation of this enzyme. TDO is a homotetrameric enzyme and is highly specific for L-Trp and related derivatives such as 6-fluoro-Trp as the substrate. In comparison, IDO is monomeric, and shows activity toward a larger collection of substrates, including L-Trp, D-tryptophan (D-Trp), serotonin, and tryptamine (3), although the Km for D-Trp is ≈100-fold higher than that for L-Trp (13). The structure of human IDO in the catalytically inactive, ferric [Fe(III)]-heme state in complex with the 4-phenylimidazole inhibitor has recently been reported (14). Although this structure gave information about important active site residues, the inhibitor is coordinat...
Salicylic acid (SA) is a critical signal for the activation of plant defense responses against pathogen infections. We recently identified SA-binding protein 2 (SABP2) from tobacco as a protein that displays high affinity for SA and plays a crucial role in the activation of systemic acquired resistance to plant pathogens. Here we report the crystal structures of SABP2, alone and in complex with SA at up to 2.1-Å resolution. The structures confirm that SABP2 is a member of the α/β hydrolase superfamily of enzymes, with Ser-81, His-238, and Asp-210 as the catalytic triad. SA is bound in the active site and is completely shielded from the solvent, consistent with the high affinity of this compound for SABP2. Our biochemical studies reveal that SABP2 has strong esterase activity with methyl salicylate (MeSA) as the substrate, and that SA is a potent product inhibitor of this catalysis. Modeling of SABP2 with MeSA in the active site is consistent with all these biochemical observations. Our results suggest that SABP2 may be required to convert MeSA to SA as part of the signal transduction pathways that activate systemic acquired resistance and perhaps local defense responses as well.

Keywords: salicylic acid | salicylic-acid-binding protein | systemic acquired resistance | α/β hydrolase
We have developed an approach for determining NMR structures of proteins over 20 kDa that utilizes sparse distance restraints obtained using transverse relaxation optimized spectroscopy experiments on perdeuterated samples to guide RASREC Rosetta NMR structure calculations. The method was tested on 11 proteins ranging from 15 to 40 kDa, seven of which were previously unsolved. The RASREC Rosetta models were in good agreement with models obtained using traditional NMR methods with larger restraint sets. In five cases X-ray structures were determined or were available, allowing comparison of the accuracy of the Rosetta models and conventional NMR models. In all five cases, the Rosetta models were more similar to the X-ray structures over both the backbone and side-chain conformations than the "best effort" structures determined by conventional methods. The incorporation of sparse distance restraints into RASREC Rosetta allows routine determination of high-quality solution NMR structures for proteins up to 40 kDa, and should be broadly useful in structural biology.

Keywords: nuclear magnetic resonance | sparse data | maltose binding protein | structural genomics | genetic algorithms

Advances in hardware, sample preparation, pulse sequence development, and refinement techniques have expanded the size and complexity of proteins accessible to structure determination by solution-state NMR to include proteins that, until recently, were exclusively the realm of X-ray crystallography (1-3). However, despite a number of landmark studies (4-7), only a small percentage of structures solved by NMR and deposited in the Protein Data Bank exceed 20 kDa in molecular weight.
Larger structures need to be assembled by combining structural information from individual domains, and require additional techniques to elucidate the spatial arrangement, such as shape fitting (5) and/or paramagnetic restraints (8).

The 20-kDa general limit coincides with the two fundamental problems in solution-state NMR: resonance overlap and progressive increase in the transverse relaxation rate (1/T₂). As the size of a molecule increases, so does the rotational correlation time and, consequently, the efficiency of ¹H-¹H relaxation mechanisms. One way to suppress these effects is to incorporate deuterium into the protein sample, diluting the ¹H-¹H relaxation networks and increasing ¹³C and ¹⁵N relaxation times, resulting in sharper line widths and dramatic improvement of the signal-to-noise ratios (2, 9, 10). Perdeuteration is generally required for studies of larger proteins (11-14), particularly membrane proteins (15, 16).

Unfortunately, deuteration also eliminates the majority of ¹H-¹H NOEs, the main source of long-range distance information in solution-state NMR. Several methods have emerged for reintroducing protons at selected sites to function as distance probes in the structure (11, 17). Methyl groups of isoleucine δ1, leucine, and valine side chains are straightforward to label with ¹³C and ¹H isotopes in an otherwise deuterated protein sample (12, ...
Overexpression of proteins in Escherichia coli at low temperature improves their solubility and stability. Here, we apply the unique features of the cspA gene to develop a series of expression vectors, termed pCold vectors, that drive the high expression of cloned genes upon induction by cold-shock. Several proteins were produced with very high yields, including E. coli EnvZ ATP-binding domain (EnvZ-B) and Xenopus laevis calmodulin (CaM). The pCold vector system can also be used to selectively enrich target proteins with isotopes to study their properties in cell lysates using NMR spectroscopy. We have cloned 38 genes from a range of prokaryotic and eukaryotic organisms into both pCold and pET14 (ref. 3) systems, and found that pCold vectors are highly complementary to the widely used pET vectors.