FTMap is a computational mapping server that identifies binding hot spots of macromolecules, i.e., regions of the surface with major contributions to the ligand binding free energy. To use FTMap, users submit a protein, DNA, or RNA structure in PDB format. FTMap samples billions of positions of small organic molecules used as probes and scores the probe poses using a detailed energy expression. Regions that bind clusters of multiple probe types identify the binding hot spots, in good agreement with experimental data. FTMap serves as the basis for other servers, namely FTSite to predict ligand binding sites, FTFlex to account for side-chain flexibility, FTMap/param to parameterize additional probes, and FTDyn to map ensembles of protein structures. Applications include determining druggability of proteins, identifying ligand moieties that are most important for binding, finding the most bound-like conformation in ensembles of unliganded protein structures, and providing input for fragment-based drug design. FTMap is more accurate than classical mapping methods such as GRID and MCSS, and is much faster than the more recent approaches to protein mapping based on mixed molecular dynamics. Using 16 probe molecules, the FTMap server finds the hot spots of an average-size protein in less than an hour. Since FTFlex performs mapping for all low-energy conformers of side chains in the binding site, its completion time is proportionately longer.
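The idea that hot spots are regions where clusters of several probe types coincide can be illustrated with a small sketch. This is not FTMap's actual algorithm or API: the `ProbeCluster` type, the `consensus_sites` function, the greedy grouping rule, and the 4.0 Å cutoff are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class ProbeCluster:
    probe: str                           # probe type name (hypothetical labels)
    center: tuple[float, float, float]   # cluster center, in angstroms


def consensus_sites(clusters, cutoff=4.0):
    """Greedily group probe clusters whose centers lie within `cutoff`
    (an assumed value) of a site's first member, then rank the sites
    by how many distinct probe types they contain."""
    sites = []  # each site is a list of ProbeCluster objects
    for cluster in clusters:
        for site in sites:
            if dist(cluster.center, site[0].center) <= cutoff:
                site.append(cluster)
                break
        else:
            # No existing site is close enough, so start a new one.
            sites.append([cluster])
    # Sites that bind more distinct probe types come first.
    return sorted(sites, key=lambda s: len({c.probe for c in s}), reverse=True)


# Toy input: three probe types overlap near the origin, one lies far away.
example = [
    ProbeCluster("ethanol", (0.0, 0.0, 0.0)),
    ProbeCluster("acetone", (1.0, 0.5, 0.0)),
    ProbeCluster("benzene", (0.5, 1.0, 1.0)),
    ProbeCluster("ethanol", (20.0, 0.0, 0.0)),
]
ranked = consensus_sites(example)
```

In this toy input the first ranked site collects three distinct probe types and would be treated as the strongest candidate hot spot, while the isolated ethanol cluster forms a weak second site.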
Molecular dynamics (MD) simulations of proteins reveal the existence of many transient surface pockets; however, the factors determining what small subset of these represent druggable or functionally relevant ligand binding sites, called "cryptic sites," are not understood. Here, we examine multiple X-ray structures for a set of proteins with validated cryptic sites, using the computational hot spot identification tool FTMap. The results show that cryptic sites in ligand-free structures generally have a strong binding energy hot spot very close by. As expected, regions around cryptic sites exhibit above-average flexibility, and close to 50% of the proteins studied here have unbound structures that could accommodate the ligand without clashes. Nevertheless, the strong hot spot neighboring each cryptic site is almost always exploited by the bound ligand, suggesting that binding may frequently involve an induced fit component. We additionally evaluated the structural basis for cryptic site formation, by comparing unbound to bound structures. Cryptic sites are most frequently occluded in the unbound structure by intrusion of loops (22.5%), side chains (19.4%), or in some cases entire helices (5.4%), but motions that create sites that are too open can also eliminate pockets (19.4%). The flexibility of cryptic sites frequently leads to missing side chains or loops (12%) that are particularly evident in low resolution crystal structures. An interesting observation is that cryptic sites formed solely by the movement of side chains, or of backbone segments with fewer than five residues, result only in low affinity binding sites with limited use for drug discovery.
Cigarette smoke creates a molecular field of injury in epithelial cells that line the respiratory tract. We hypothesized that transcriptome sequencing (RNA-Seq) will enhance our understanding of the field of molecular injury in response to tobacco smoke exposure and lung cancer pathogenesis by identifying gene expression differences not interrogated or accurately measured by microarrays. We sequenced the high-molecular-weight fraction of total RNA (>200 nt) from pooled bronchial airway epithelial cell brushings (n = 3 patients per pool) obtained during bronchoscopy from healthy never smoker (NS) and current smoker (S) volunteers and smokers with (C) and without (NC) lung cancer undergoing lung nodule resection surgery. RNA-Seq libraries were prepared using 2 distinct approaches, one capable of capturing non-polyadenylated RNA (the prototype NuGEN Ovation RNA-Seq protocol) and the other designed to measure only polyadenylated RNA (the standard Illumina mRNA-Seq protocol) followed by sequencing generating approximately 29 million 36 nt reads per pool and approximately 22 million 75 nt paired-end reads per pool, respectively. The NuGEN protocol captured additional transcripts not detected by the Illumina protocol at the expense of reduced coverage of polyadenylated transcripts, while longer read lengths and a paired-end sequencing strategy significantly improved the number of reads that could be aligned to the genome. The aligned reads derived from the two complementary protocols were used to define the compendium of genes expressed in the airway epithelium (n = 20,573 genes). Pathways related to the metabolism of xenobiotics by cytochrome P450, retinol metabolism, and oxidoreductase activity were enriched among genes differentially expressed in smokers, whereas chemokine signaling pathways, cytokine–cytokine receptor interactions, and cell adhesion molecules were enriched among genes differentially expressed in smokers with lung cancer. 
There was a significant correlation between the RNA-Seq gene expression data and Affymetrix microarray data generated from the same samples (P < 0.001); however, the RNA-Seq data detected additional smoking- and cancer-related transcripts whose expression was either not interrogated by or not found to be significantly altered when using microarrays, including smoking-related changes in the inflammatory genes S100A8 and S100A9 and cancer-related changes in MUC5AC and secretoglobin (SCGB3A1). Quantitative real-time PCR confirmed differential expression of select genes and non-coding RNAs within individual samples. These results demonstrate that transcriptome sequencing has the potential to provide new insights into the biology of the airway field of injury associated with smoking and lung cancer. The measurement of both coding and non-coding transcripts by RNA-Seq has the potential to help elucidate mechanisms of response to tobacco smoke and to identify additional biomarkers of lung cancer risk and novel targets for chemoprevention.
Fragment-based drug discovery (FBDD) relies on the premise that the fragment binding mode will be conserved on subsequent expansion to a larger ligand. However, no general condition has been established to explain when fragment binding modes will be conserved. We show that a remarkably simple condition can be developed in terms of how fragments coincide with binding energy hot spots—regions of the protein where interactions with a ligand contribute substantial binding free energy—the locations of which can easily be determined computationally. Because a substantial fraction of the free energy of ligand binding comes from interacting with the residues in the energetically most important hot spot, a ligand moiety that sufficiently overlaps with this region will retain its location even when other parts of the ligand are removed. This hypothesis is supported by eight case studies. The condition helps identify whether a protein is suitable for FBDD, predicts the size of fragments required for screening, and determines whether a fragment hit can be extended into a higher affinity ligand. Our results show that ligand binding sites can usefully be thought of in terms of an anchor site, which is the top-ranked hot spot and dominates the free energy of binding, surrounded by a number of weaker satellite sites that confer improved affinity and selectivity for a particular ligand and that it is the intrinsic binding potential of the protein surface that determines whether it can serve as a robust binding site for a suitably optimized ligand.
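The notion of a fragment "sufficiently overlapping" the top-ranked hot spot can be made concrete with a simple geometric measure. The sketch below is only an illustration under stated assumptions: the `overlap_fraction` function, the 2.0 Å cutoff, and the coordinates are hypothetical and are not the overlap criterion defined in the paper.

```python
from math import dist


def overlap_fraction(fragment_atoms, hotspot_atoms, cutoff=2.0):
    """Return the fraction of fragment atoms lying within `cutoff`
    angstroms (an assumed value) of at least one hot spot probe atom."""
    if not fragment_atoms:
        return 0.0
    covered = sum(
        1
        for atom in fragment_atoms
        if any(dist(atom, probe) <= cutoff for probe in hotspot_atoms)
    )
    return covered / len(fragment_atoms)


# Toy coordinates: three fragment atoms sit in the hot spot, one is far away.
fragment = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
hotspot = [(0.0, 0.0, 0.5), (2.5, 0.0, 0.0)]
fraction = overlap_fraction(fragment, hotspot)
```

A high fraction for the anchor hot spot would, under this illustrative measure, suggest that the fragment is likely to keep its position when the rest of the ligand is added or removed; any threshold applied to it would itself be an assumption.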
Development of small molecule inhibitors of protein-protein interactions (PPIs) is hampered by our poor understanding of the druggability of PPI target sites. Here, we describe the combined application of alanine-scanning mutagenesis, fragment screening, and FTMap computational hot spot mapping to evaluate the energetics and druggability of the highly charged PPI interface between Kelch-like ECH-associated protein 1 (KEAP1) and nuclear factor erythroid 2 like 2 (Nrf2), an important drug target. FTMap identifies four binding energy hot spots at the active site. Only two of these are exploited by Nrf2, which alanine scanning of both proteins shows to bind primarily through E79 and E82 interacting with KEAP1 residues S363, R380, R415, R483, and S508. We identify fragment hits and obtain X-ray complex structures for three fragments via crystal soaking using a new crystal form of KEAP1. Combining these results provides a comprehensive and quantitative picture of the origins of binding energy at the interface. Our findings additionally reveal non-native interactions that might be exploited in the design of uncharged synthetic ligands to occupy the same site on KEAP1 that has evolved to bind the highly charged DEETGE binding loop of Nrf2. These include π-stacking with KEAP1 Y525 and interactions at an FTMap-identified hot spot deep in the binding site. Finally, we discuss how the complementary information provided by alanine-scanning mutagenesis, fragment screening, and computational hot spot mapping can be integrated to more comprehensively evaluate PPI druggability.
Phosphoglycosyltransferases (PGTs) catalyze the transfer of a C1′-phosphosugar from a soluble sugar nucleotide diphosphate to a polyprenol-phosphate. These enzymes act at the membrane interface, forming the first membrane-associated intermediates in the biosynthesis of cell-surface glycans and glycoconjugates including glycoproteins, glycolipids and the peptidoglycan in bacteria. PGTs vary greatly in both their membrane topologies and their substrate preferences. PGTs, such as MraY and WecA, are polytopic, while other families of uniquely prokaryotic enzymes have only a single predicted transmembrane helix. PglC, a PGT involved in the biosynthesis of N-linked glycoproteins in the enteropathogen C. jejuni, is representative of one of the structurally simplest members of the diverse family of small bacterial PGT enzymes. Herein, we apply bioinformatics and covariance-weighted distance constraints in geometry- and homology-based model building, together with mutational analysis, to investigate monotopic PGTs. The pool of 15,000 sequences that are analyzed includes the PglC-like enzymes, as well as sequences from two other related PGTs that contain a “PglC-like” domain embedded in their larger structures (namely, the bifunctional PglB family, typified by PglB from N. gonorrheae, and WbaP-like enzymes, typified by WbaP from S. enterica). Including these two sub-families of PGTs in the analysis highlights key residues conserved across all three families of small bacterial PGTs. Mutagenesis analysis of these conserved residues provides further information on the essentiality of many of these residues in catalysis. Construction of a structural model of the cytosolic globular domain utilizing three-dimensional distance constraints, provided by conservation covariance analysis, provides additional insight into the catalytic core of these families of small bacterial PGT enzymes.
The sea urchin larval skeleton offers a simple model for formation of developmental patterns. The calcium carbonate skeleton is secreted by primary mesenchyme cells (PMCs) in response to largely unknown patterning cues expressed by the ectoderm. To discover novel ectodermal cues, we performed an unbiased RNA-Seq-based screen and functionally tested candidates; we thereby identified several novel skeletal patterning cues. Among these, we show that SLC26a2/7 is a ventrally expressed sulfate transporter that promotes a ventral accumulation of sulfated proteoglycans, which is required for ventral PMC positioning and skeletal patterning. We show that the effects of SLC perturbation are mimicked by manipulation of either external sulfate levels or proteoglycan sulfation. These results identify novel skeletal patterning genes and demonstrate that ventral proteoglycan sulfation serves as a positional cue for sea urchin skeletal patterning.