Degeneracy in the genetic code, which enables a single protein to be encoded by a multitude of synonymous gene sequences, has an important role in regulating protein expression, but substantial uncertainty exists concerning the details of this phenomenon. Here we analyze the sequence features influencing protein expression levels in 6,348 experiments using bacteriophage T7 polymerase to synthesize messenger RNA in Escherichia coli. Logistic regression yields a new codon-influence metric that correlates only weakly with genomic codon-usage frequency, but strongly with global physiological protein concentrations and also mRNA concentrations and lifetimes in vivo. Overall, the codon content influences protein expression more strongly than mRNA-folding parameters, although the latter dominate in the initial ~16 codons. Genes redesigned based on our analyses are transcribed with unaltered efficiency but translated with higher efficiency in vitro. The less efficiently translated native sequences show greatly reduced mRNA levels in vivo. Our results suggest that codon content modulates a kinetic competition between protein elongation and mRNA degradation that is a central feature of the physiology and also possibly the regulation of translation in E. coli.
The second messenger cyclic diguanylate (c-di-GMP) controls the transition between motile and sessile growth in eubacteria, but little is known about the proteins that sense its concentration. Bioinformatics analyses suggested that PilZ domains bind c-di-GMP and allosterically modulate effector pathways. We have determined a 1.9 Å crystal structure of c-di-GMP bound to VCA0042/PlzD, a PilZ domain-containing protein from Vibrio cholerae. Either this protein or another specific PilZ domain-containing protein is required for V. cholerae to efficiently infect mice. VCA0042/PlzD comprises a C-terminal PilZ domain plus an N-terminal domain with a similar β-barrel fold. C-di-GMP contacts seven of the nine strongly conserved residues in the PilZ domain, including three in a seven-residue long N-terminal loop that undergoes a conformational switch as it wraps around c-di-GMP. This switch brings the PilZ domain into close apposition with the N-terminal domain, forming a new allosteric interaction surface that spans these domains and the c-di-GMP at their interface. The very small size of the N-terminal conformational switch is likely to explain the facile evolutionary diversification of the PilZ domain.
Crystallization has proven to be the most significant bottleneck to high-throughput protein structure determination using diffraction methods. We have used the large-scale, systematically generated experimental results of the Northeast Structural Genomics Consortium to characterize the biophysical properties that control protein crystallization. Data mining of crystallization results combined with explicit folding studies leads to the conclusion that crystallization propensity is controlled primarily by the prevalence of well-ordered surface epitopes capable of mediating interprotein interactions and is not strongly influenced by overall thermodynamic stability. These analyses identify specific sequence features correlating with crystallization propensity that can be used to estimate the crystallization probability of a given construct. Analyses of entire predicted proteomes demonstrate substantial differences in the bulk amino acid sequence properties of human versus eubacterial proteins that reflect likely differences in their biophysical properties, including crystallization propensity. Finally, our thermodynamic measurements enable critical evaluation of previous claims regarding correlations between protein stability and bulk sequence properties, which generally are not supported by our dataset.
Published in final edited form as: Nat Biotechnol. 2009 January; 27(1): 51-57. doi:10.1038/nbt.1514.
The ability to determine the atomic structures of macromolecules represents a great achievement in molecular biology because of the unparalleled value of this information in understanding the fundamental chemistry of life [1][2][3][4][5].
While nuclear magnetic resonance represents an invaluable source of structural information, especially for small proteins, most macromolecular structures are determined using x-ray crystallography. Capitalizing on the recent proliferation of genomic sequence data, "structural genomics" consortia have been organized worldwide to develop methods and infrastructure for high-throughput protein structure determination. These groups have contributed to improvements in expression and structure determination methods [6], and the four largest U.S. consortia accounted for 45% of all novel structures deposited in the Protein Data Bank (PDB) in 2007 [7]. While these efforts contribute to the impressive progress of the structural biology community in characterizing the full repertoire of protein structures, the rate of growth in sequence information nonetheless far outpaces that of structural information. Given the ongoing acceleration of whole-genome sequencing, the gap between the two will continue to expand without a breakthrough in macromolecular structure determination methods. The systematic efforts of structural genomics projects show that crystallization is the major bottleneck to protein structure determinati...
Ribosomes are large ribonucleoprotein complexes that catalyze the peptidyltransferase reaction in protein synthesis and are thus responsible for the translation of transcripts encoded in the cellular genome. Detailed analyses of eukaryote and prokaryote ribosomes by peptide mass spectrometry provide insights into the composition of ribosomal proteins and show a high degree of posttranslational modifications (1). These modifications are believed to extend molecular structures beyond the limits imposed by the 20 genetically encoded amino acids (2). For example, the Escherichia coli ribosomal protein S12 is shown to be post-translationally modified through 3-methylthiolation of the Asp-89 residue (Scheme 1A), a modification believed to improve translational accuracy (3, 4). Recently, the yliG gene (later named rimO for ribosomal modification O) has been shown to be responsible for this reaction in vivo (5). The protein encoded by this gene, RimO, contains in its central part the highly conserved cysteine triad Cys-XXX-Cys-XX-Cys, which is the hallmark of the radical AdoMet superfamily (Scheme 1B) (6). Radical AdoMet enzymes share a common mechanism that utilizes a [4Fe-4S]2+/1+ cluster chelated by the three cysteines of the triad and by AdoMet to initiate, under reducing conditions, a radical reaction mediated by a 5′-deoxyadenosyl radical arising from the reductive cleavage of the bound AdoMet (7, 8). This radical abstracts a hydrogen atom from a properly positioned substrate, creating a substrate-based carbon radical. In the formation of 3-methylthioaspartate at Asp-89 of the S12 protein (ms-D89-S12), this radical is supposed to be located on the C3 of Asp-89, and then the site becomes successively thiolated and methylated (9). Several other radical AdoMet enzymes besides RimO catalyze the thiolation of substrates, including a tRNA-methylthio-
The 1.8 Å resolution de novo structure of nucleoside 2-deoxyribosyltransferase (EC 2.4.2.6) from Trypanosoma brucei (TbNDRT) has been determined by SAD phasing in an unliganded state and several ligand-bound states. This enzyme is important in the salvage pathway of nucleoside recycling. To identify novel lead compounds, we exploited "fragment cocktail soaks". Out of 304 compounds tried in 31 cocktails, four compounds could be identified crystallographically in the active site. In addition, we demonstrated that very short soaks of approximately 10 s are sufficient even for rather hydrophobic ligands to bind in the active site groove, which is promising for the application of similar soaking experiments to less robust crystals of other proteins.
Malonyl-coenzyme A decarboxylase (MCD) is found from bacteria to humans, has important roles in regulating fatty acid metabolism and food intake, and is an attractive target for drug discovery. We report here four crystal structures of MCD from human, Rhodopseudomonas palustris, Agrobacterium vitis, and Cupriavidus metallidurans at up to 2.3 Å resolution. The MCD monomer contains an N-terminal helical domain involved in oligomerization and a C-terminal catalytic domain. The four structures exhibit substantial differences in the organization of the helical domains and, consequently, the oligomeric states and intersubunit interfaces. Unexpectedly, the MCD catalytic domain is structurally homologous to those of the GCN5-related N-acetyltransferase superfamily, especially the curacin A polyketide synthase catalytic module, with a conserved His-Ser/Thr dyad important for catalysis. Our structures, along with mutagenesis and kinetic studies, provide a molecular basis for understanding pathogenic mutations and catalysis, as well as a template for structure-based drug design.
Recent studies of signal transduction in bacteria have revealed a unique second messenger, bis-(3′-5′)-cyclic dimeric GMP (c-di-GMP), which regulates transitions between motile states and sessile states, such as biofilms. C-di-GMP is synthesized from two GTP molecules by diguanylate cyclases (DGC). The catalytic activity of DGCs depends on a conserved GG(D/E)EF domain, usually part of a larger multi-domain protein organization. The domains other than the GG(D/E)EF domain often control DGC activation. This paper presents the 1.83 Å crystal structure of an isolated catalytically competent GG(D/E)EF domain from the A1U3W3_MARAV protein from Marinobacter aquaeolei. Co-crystallization with GTP resulted in enzymatic synthesis of c-di-GMP. Comparison with previously solved DGC structures shows a similar orientation of c-di-GMP bound to an allosteric regulatory site mediating feedback inhibition of the enzyme. Biosynthesis of c-di-GMP in the crystallization reaction establishes that the enzymatic activity of this DGC domain does not require interaction with regulatory domains.
We report here the crystal structure at 2.0 Å resolution of the AGR_C_4470p protein from the Gram-negative bacterium Agrobacterium tumefaciens. The protein is a tightly associated dimer, each subunit of which bears strong structural homology with the two domains of the heme utilization protein ChuS from Escherichia coli and HemS from Yersinia enterocolitica. Remarkably, the organization of the AGR_C_4470p dimer is the same as that of the two domains in ChuS and HemS, providing structural evidence that these two proteins evolved by gene duplication. However, the binding site for heme, while conserved in HemS and ChuS, is not conserved in AGR_C_4470p, suggesting that it probably has a different function. This is supported by the presence of two homologs of AGR_C_4470p in E. coli, in addition to the ChuS protein.