Assigning valid functions to proteins identified in genome projects is challenging, with over-prediction and database annotation errors being major concerns1. We, and others2, are developing computation-guided strategies for functional discovery using “metabolite docking” to experimentally derived3 or homology-based4 three-dimensional structures. Bacterial metabolic pathways often are encoded by “genome neighborhoods” (gene clusters and/or operons), which can provide important clues for functional assignment. We recently demonstrated the synergy of docking and pathway context by “predicting” the intermediates in the glycolytic pathway in E. coli5. Metabolite docking to multiple binding proteins/enzymes in the same pathway increases the reliability of in silico predictions of substrate specificities because the pathway intermediates are structurally similar. We report that structure-guided approaches for predicting the substrate specificities of several enzymes encoded by a bacterial gene cluster allowed i) the correct prediction of the in vitro activity of a structurally characterized enzyme of unknown function (PDB 2PMQ), 2-epimerization of trans-4-hydroxy-L-proline betaine (tHyp-B) and cis-4-hydroxy-D-proline betaine (cHyp-B), and ii) the correct identification of the catabolic pathway in which Hyp-B 2-epimerase participates. The substrate-liganded pose predicted by virtual library screening (docking) was confirmed experimentally. The enzymatic activities in the predicted pathway were confirmed by in vitro assays and genetic analyses; the intermediates were identified by metabolomics; and repression of the genes encoding the pathway by high salt was established by transcriptomics, confirming the osmolyte role of tHyp-B. This study establishes the utility of structure-guided functional predictions to enable the discovery of new metabolic pathways.
Metabolic pathways in eubacteria and archaea often are encoded by operons and/or gene clusters (genome neighborhoods) that provide important clues for assignment of both enzyme functions and metabolic pathways. We describe a bioinformatic approach (genome neighborhood network; GNN) that enables large-scale prediction of the in vitro enzymatic activities and in vivo physiological functions (metabolic pathways) of uncharacterized enzymes in protein families. We demonstrate the utility of the GNN approach by predicting in vitro activities and in vivo functions in the proline racemase superfamily (PRS; InterPro IPR008794). The predictions were verified by measuring in vitro activities for 51 proteins in 12 families in the PRS that represent ∼85% of the sequences; in vitro activities of pathway enzymes, carbon/nitrogen source phenotypes, and/or transcriptomic studies confirmed the predicted pathways. The synergistic use of sequence similarity networks3 and GNNs will facilitate the discovery of the components of novel, uncharacterized metabolic pathways in sequenced genomes. DOI: http://dx.doi.org/10.7554/eLife.03275.001
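The genome-neighborhood idea in the abstract above can be illustrated with a toy sketch. This is not the authors' GNN implementation; it only assumes a linear list of genes in chromosomal order, each already labeled with a protein family, and counts which families occur within a fixed window of each member of a query family. The function name, the `window` parameter, and the toy gene and family labels are all hypothetical, and strand, intergenic distance, and circular chromosomes are ignored.

```python
from collections import Counter


def neighborhood_families(genes, query_family, window=2):
    """Count families found near members of query_family.

    genes: list of (gene_id, family) tuples in chromosomal order
           (hypothetical input format, assumed for this sketch).
    window: number of genes to inspect on each side of a query gene.
    """
    counts = Counter()
    for i, (_, family) in enumerate(genes):
        if family != query_family:
            continue
        # Clamp the window to the ends of the (linear) gene list.
        start = max(0, i - window)
        stop = min(len(genes), i + window + 1)
        for j in range(start, stop):
            if j != i:
                counts[genes[j][1]] += 1
    return counts


# Toy data: gene IDs and family labels are invented for illustration.
toy_genome = [
    ("g1", "transporter"),
    ("g2", "PRS"),
    ("g3", "oxidase"),
    ("g4", "kinase"),
    ("g5", "oxidase"),
    ("g6", "PRS"),
    ("g7", "transporter"),
]

neighbors = neighborhood_families(toy_genome, "PRS", window=1)
```

Families that recur next to the query family across many genomes are candidates for sharing its pathway; a real analysis would also need family assignments from a sequence database and some way to judge whether a co-occurrence is more frequent than chance.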
Members of the mechanistically diverse enolase superfamily catalyze reactions that are initiated by abstraction of the alpha-proton of a carboxylate anion to generate an enolate anion intermediate that is stabilized by coordination to a Mg2+ ion. The catalytic groups, ligands for an essential Mg2+ and acid/base catalysts, are located in the (beta/alpha)8-barrel domain of the bidomain proteins. The assigned physiological functions in the muconate lactonizing enzyme (MLE) subgroup (Lys acid/base catalysts at the ends of the second and sixth beta-strands in the barrel domain) are cycloisomerization (MLE), dehydration (o-succinylbenzoate synthase; OSBS), and epimerization (L-Ala-D/L-Glu epimerase). We previously studied a putatively promiscuous member of the MLE subgroup with uncertain physiological function from Amycolatopsis that was discovered based on its ability to catalyze the racemization of N-acylamino acids (N-acylamino acid racemase; NAAAR) but also catalyzes the OSBS reaction [OSBS/NAAAR; Palmer, D. R., Garrett, J. B., Sharma, V., Meganathan, R., Babbitt, P. C., and Gerlt, J. A. (1999) Biochemistry 38, 4252-4258]. In this manuscript, we report functional characterization of a homologue of this protein encoded by the genome of Geobacillus kaustophilus as well as two other proteins that are encoded by the same operon, a divergent member of the Gcn5-related N-acetyltransferase (GNAT) superfamily of enzymes whose members catalyze the transfer of an acyl group from an acyl-CoA donor to an amine acceptor, and a member of the M20 peptidase/carboxypeptidase G2 family. We determined that the member of the GNAT superfamily is succinyl-CoA:D-amino acid N-succinyltransferase, the member of the enolase superfamily is N-succinylamino acid racemase (NSAR), and the member of the M20 peptidase/carboxypeptidase G2 family is N-succinyl-L-amino acid hydrolase.
We conclude that (1) these enzymes constitute a novel, irreversible pathway for the conversion of D- to L-amino acids and (2) the NSAR reaction is a new physiological function in the MLE subgroup. The NSAR is also functionally promiscuous and catalyzes an efficient OSBS reaction; intriguingly, the operon for menaquinone biosynthesis in G. kaustophilus does not encode an OSBS, raising the possibility that the NSAR is a bifunctional enzyme rather than an accidentally promiscuous enzyme.