The binding and catalytic functions of proteins are generally mediated by a small number of functional residues held in place by the overall protein structure. Here, we describe deep learning approaches for scaffolding such functional sites without needing to prespecify the fold or secondary structure of the scaffold. The first approach, “constrained hallucination,” optimizes sequences such that their predicted structures contain the desired functional site. The second approach, “inpainting,” starts from the functional site and fills in additional sequence and structure to create a viable protein scaffold in a single forward pass through a specifically trained RoseTTAFold network. We use these two methods to design candidate immunogens, receptor traps, metalloproteins, enzymes, and protein-binding proteins and validate the designs using a combination of in silico and experimental tests.
Enzymes in the α-D-phosphohexomutases (PHM) superfamily catalyze the reversible conversion of phosphosugars, such as glucose 1-phosphate and glucose 6-phosphate. These reactions are fundamental to primary metabolism across the kingdoms of life, and are required for a myriad of cellular processes, ranging from exopolysaccharide production to protein glycosylation. The subject of extensive mechanistic characterization during the latter half of the twentieth century, these enzymes have recently benefitted from biophysical characterization, including X-ray crystallography, NMR, and hydrogen-deuterium exchange studies. This work has provided new insights into the unique catalytic mechanism of the superfamily, shed light on the molecular determinants of ligand recognition, and revealed the evolutionary conservation of conformational flexibility. Novel associations with inherited metabolic disease and the pathogenesis of bacterial infections have emerged, spurring renewed interest in the long-appreciated functional roles of these enzymes.
Multi-resistant bacteria are a major threat in modern medicine. The gram-negative coccobacillus Acinetobacter baumannii currently leads the WHO list of pathogens in critical need of new therapeutic development. The maintenance of lipid asymmetry (MLA) protein complex is one of the core machineries that transport lipids from/to the outer membrane in gram-negative bacteria. It also contributes to broad-range antibiotic resistance in several pathogens, most prominently in A. baumannii. Nonetheless, the molecular details of its role in lipid transport have remained largely elusive. Here, we report the cryo-EM maps of the core MLA complex, MlaBDEF, from the pathogen A. baumannii, in the apo-, ATP- and ADP-bound states, revealing multiple lipid binding sites on the cytosolic and periplasmic sides of the complex. Molecular dynamics simulations suggest their potential trajectory across the membrane. Together with the recently reported structures of the E. coli orthologue, these data also allow us to propose a molecular mechanism of lipid transport by the MLA system.
The key metabolic enzyme phosphoglucomutase 1 (PGM1) controls glucose homeostasis in most human cells. Four proteins related to PGM1, known as PGM2, PGM2L1, PGM3 and PGM5, and referred to herein as paralogs, are encoded in the human genome. Although all members of the same enzyme superfamily, these proteins have distinct substrate preferences and different functional roles. The recent association of PGM1 and PGM3 with inherited enzyme deficiencies prompts us to revisit sequence-structure and other relationships among the PGM1 paralogs, which are understudied despite their importance in human biology. Using currently available sequence, structure, and expression data, we investigated evolutionary relationships, tissue-specific expression profiles, and the amino acid preferences of key active site motifs. Phylogenetic analyses indicate both ancient and more recent divergence between the different enzyme sub-groups comprising the human paralogs. Tissue-specific protein and RNA expression profiles show widely varying patterns for each paralog, providing insight into function and disease pathology. Multiple sequence alignments confirm high conservation of key active site regions, but also reveal differences related to substrate specificity. In addition, we find that sequence variants of PGM2, PGM2L1, and PGM5 verified in the human population affect residues associated with disease-related mutants in PGM1 or PGM3. This suggests that inherited diseases related to dysfunction of these paralogs will likely occur in humans.
The maintenance of lipid asymmetry (MLA) system is involved in lipid transport from/to the outer membrane in gram-negative bacteria, and contributes to broad-range antibiotic resistance. Here, we report the cryo-EM structure of the A. baumannii MlaBDEF core complex, in the apo, ADP- and AppNHp-bound states. This reveals multiple lipid binding sites, and suggests a mechanism for their transport.
Gram-negative bacteria are enveloped by two lipid bilayers, separated by the periplasmic space containing the peptidoglycan cell wall. The two membranes have distinct lipid compositions: the inner membrane (IM) consists of glycerophospholipids, with both leaflets having similar compositions, while the outer membrane (OM) is asymmetric, with an outer leaflet of lipopolysaccharides (LPS) and an inner leaflet of glycerophospholipids (1). This lipid gradient is maintained by several machineries, including YebT, PqiB, and the multicomponent MLA system (2, 3), which consists of MlaA present in the OM (4, 5), the shuttle MlaC in the periplasmic space, and the MlaBDEF ABC transporter system in the IM (6). The directionality of lipid transport by the MLA system has been the subject of debate, with initial reports suggesting that it recycles lipids from the OM to the IM (7), but recent results (6, 8) indicated that it might export glycerophospholipids to the outer membrane. Low-resolution cryo-EM maps of the MlaBDEF core complex from Escherichia coli (2) and Acinetobacter baumannii (6) have revealed the overall architecture of the complex, but did not allow the molecular details of lipid binding and transport to be elucidated.
To address the mechanism of MLA functioning, we have determined the cryo-EM structure of the A. baumannii MlaBDEF complex, bound to the non-hydrolyzable ATP analogue AppNHp,
Advances in cryo-electron microscopy (cryoEM) and deep-learning guided protein structure prediction have expedited structural studies of protein complexes. However, methods for accurately determining ligand conformations are lacking. In this manuscript, we develop a tool for automatically determining ligand structures guided by medium-resolution cryoEM density. We show this method is robust at predicting ligands in maps as low as 6 Å resolution, and is able to correct receptor sidechain errors. Combining this with a measure of placement confidence, and running on all protein/ligand structures in EMDB, we show that 58% of ligands replicate the deposited model, 16% confidently find alternate conformations, 22% have ambiguous density where multiple conformations might be present, and 4% are incorrectly placed. For five cases where our approach finds an alternate conformation with high confidence, high-resolution crystal structures validate our placement. This tool and the resulting analysis should prove critical in using cryoEM to investigate protein-ligand complexes.
Advances in cryo-electron microscopy (cryoEM) and deep-learning guided protein structure prediction have expedited structural studies of protein complexes. However, methods for accurately determining ligand conformations are lacking. In this manuscript, we develop EMERALD, a tool for automatically determining ligand structures guided by medium-resolution cryoEM density. We show this method is robust at predicting ligands along with surrounding side chains in maps as low as 4.5 Å local resolution. Combining this with a measure of placement confidence and running on all protein/ligand structures in the EMDB, we show that 57% of ligands replicate the deposited model, 16% confidently find alternate conformations, 22% have ambiguous density where multiple conformations might be present, and 5% are incorrectly placed. For five cases where our approach finds an alternate conformation with high confidence, high-resolution crystal structures validate our placement. EMERALD and the resulting analysis should prove critical in using cryoEM to solve protein-ligand complexes.