The nearly 600 proteases in the human genome regulate a diversity of biological processes, including programmed cell death. Comprehensive characterization of protease signaling in complex biological samples is limited by available proteomic methods. We have developed a general approach for global identification of proteolytic cleavage sites based on enzymatic biotinylation of free protein N-termini and positive enrichment of corresponding N-terminal peptides. Using this method to study apoptosis, we have sequenced 333 caspase-like cleavage sites distributed among 292 protein substrates. These sites are generally not predicted by in vitro caspase substrate specificity, but can be used to predict other physiological caspase cleavage sites. Structural bioinformatic studies show that caspase cleavage sites often appear in surface-accessible loops and even occasionally in helical regions. Strikingly, we also find that a disproportionate number of caspase substrates physically interact, suggesting that these dimeric proteases target protein complexes and networks to elicit apoptosis.
O-linked N-acetylglucosamine (O-GlcNAc) is a dynamic, reversible monosaccharide modifier of serine and threonine residues on intracellular protein domains. Crosstalk between O-GlcNAcylation and phosphorylation has been hypothesized. Here, we identified over 1750 and 16,500 sites of O-GlcNAcylation and phosphorylation from murine synaptosomes, respectively. In total, 135 (7%) of all O-GlcNAcylation sites were also found to be sites of phosphorylation. Although many proteins were extensively phosphorylated and minimally O-GlcNAcylated, proteins found to be extensively O-GlcNAcylated were almost always phosphorylated to a similar or greater extent, indicating that the O-GlcNAcylation system specifically targets a subset of the proteome that is also phosphorylated. Both PTMs usually occur in disordered regions of protein structure, within which the locations of O-GlcNAcylation and phosphorylation are virtually random with respect to each other, suggesting that negative crosstalk at the structural level is not a common phenomenon. As a class, protein kinases are found to be more extensively O-GlcNAcylated than proteins in general, indicating the potential for crosstalk of phosphorylation with O
ModBase (http://salilab.org/modbase) is a database of annotated comparative protein structure models. The models are calculated by ModPipe, an automated modeling pipeline that relies primarily on Modeller for fold assignment, sequence–structure alignment, model building and model assessment (http://salilab.org/modeller/). ModBase currently contains 10 355 444 reliable models for domains in 2 421 920 unique protein sequences. ModBase allows users to update comparative models on demand, and request modeling of additional sequences through an interface to the ModWeb modeling server (http://salilab.org/modweb). ModBase models are available through the ModBase interface as well as the Protein Model Portal (http://www.proteinmodelportal.org/). Recently developed associated resources include the SALIGN server for multiple sequence and structure alignment (http://salilab.org/salign), the ModEval server for predicting the accuracy of protein structure models (http://salilab.org/modeval), the PCSS server for predicting which peptides bind to a given protein (http://salilab.org/pcss) and the FoXS server for calculating and fitting Small Angle X-ray Scattering profiles (http://salilab.org/foxs).
MODBASE (http://salilab.org/modbase) is a database of annotated comparative protein structure models. The models are calculated by MODPIPE, an automated modeling pipeline that relies primarily on MODELLER for fold assignment, sequence–structure alignment, model building and model assessment (http://salilab.org/modeller). MODBASE currently contains 5 152 695 reliable models for domains in 1 593 209 unique protein sequences; only models based on statistically significant alignments and/or models assessed to have the correct fold are included. MODBASE also allows users to calculate comparative models on demand, through an interface to the MODWEB modeling server (http://salilab.org/modweb). Other resources integrated with MODBASE include databases of multiple protein structure alignments (DBAli), structurally defined ligand binding sites (LIGBASE), predicted ligand binding sites (AnnoLyze), structurally defined binary domain interfaces (PIBASE) and annotated single nucleotide polymorphisms and somatic mutations found in human proteins (LS-SNP, LS-Mut). MODBASE models are also available through the Protein Model Portal (http://www.proteinmodelportal.org/).
Pathogens have evolved numerous strategies to infect their hosts, while hosts have evolved immune responses and other defenses to these foreign challenges. The vast majority of host-pathogen interactions involve protein-protein recognition, yet our current understanding of these interactions is limited. Here, we present and apply a computational whole-genome protocol that generates testable predictions of host-pathogen protein interactions. The protocol first scans the host and pathogen genomes for proteins with similarity to known protein complexes, then assesses these putative interactions, using structure if available, and, finally, filters the remaining interactions using biological context, such as the stage-specific expression of pathogen proteins and tissue expression of host proteins. The technique was applied to 10 pathogens, including species of Mycobacterium, apicomplexa, and kinetoplastida, responsible for "neglected" human diseases. The method was assessed by (1) comparison to a set of known host-pathogen interactions, (2) comparison to gene expression and essentiality data describing host and pathogen genes involved in infection, and (3) analysis of the functional properties of the human proteins predicted to interact with pathogen proteins, demonstrating an enrichment for functionally relevant host-pathogen interactions. We present several specific predictions that warrant experimental follow-up, including interactions from previously characterized mechanisms, such as cytoadhesion and protease inhibition, as well as suspected interactions in hypothesized networks, such as apoptotic pathways. Our computational method provides a means to mine whole-genome data and is complementary to experimental efforts in elucidating networks of host-pathogen protein interactions.
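The three-stage protocol described in the abstract above (scan for similarity to known complexes, assess with structure when available, filter by biological context) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Candidate` fields, the numeric scores, the thresholds, and the `predict_interactions` function are hypothetical and are not taken from the authors' software.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Candidate:
    """A putative host-pathogen protein pair (all fields are hypothetical)."""
    host_protein: str
    pathogen_protein: str
    template_similarity: float        # assumed 0..1 similarity to a known complex
    structure_score: Optional[float]  # assumed 0..1 score; None if no structure
    pathogen_stage: str               # life-cycle stage where the pathogen protein is expressed
    host_tissue: str                  # tissue where the host protein is expressed


def predict_interactions(
    candidates: Iterable[Candidate],
    min_similarity: float,
    min_structure_score: float,
    compatible_contexts: Set[Tuple[str, str]],
) -> List[Candidate]:
    """Apply the three filtering stages in order and return the survivors."""
    kept = []
    for c in candidates:
        # Stage 1: require similarity to a known protein complex.
        if c.template_similarity < min_similarity:
            continue
        # Stage 2: assess with structure only when a structure is available.
        if c.structure_score is not None and c.structure_score < min_structure_score:
            continue
        # Stage 3: keep pairs whose (stage, tissue) context is biologically plausible.
        if (c.pathogen_stage, c.host_tissue) not in compatible_contexts:
            continue
        kept.append(c)
    return kept
```

In this sketch a pair without a structure passes stage 2 unassessed, which is one possible reading of "using structure if available"; a stricter pipeline could instead flag such pairs as lower confidence.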
All predictions for both protease types are publicly available at http://salilab.org/peptide. A web server at the same site allows a user to train new SVM models to make predictions for any protein that recognizes specific oligopeptide ligands.
Falcipain-2, a papain family cysteine protease of the malaria parasite Plasmodium falciparum, plays a key role in parasite hydrolysis of hemoglobin and is a potential chemotherapeutic target. As with many proteases, falcipain-2 is synthesized as a zymogen, and the prodomain inhibits activity of the mature enzyme. To investigate the mechanism of regulation of falcipain-2 by its prodomain, we expressed constructs encoding different portions of the prodomain and tested their ability to inhibit recombinant mature falcipain-2. We identified a C-terminal segment (Leu155–Asp243) of the prodomain, including two motifs (ERFNIN and GNFD) that are conserved in cathepsin L sub-family papain family proteases, as the mediator of prodomain inhibitory activity. Circular dichroism analysis showed that the prodomain including the C-terminal segment, but not constructs lacking this segment, was rich in secondary structure, suggesting that the segment plays a crucial role in protein folding. The falcipain-2 prodomain also efficiently inhibited other papain family proteases, including cathepsin K, cathepsin L, cathepsin B, and cruzain, but it did not inhibit cathepsin C or tested proteases of other classes. A structural model of pro-falcipain-2 was constructed by homology modeling based on crystallographic structures of mature falcipain-2, procathepsin K, procathepsin L, and procaricain, offering insights into the nature of the interaction between the prodomain and mature domain of falcipain-2 as well as into the broad specificity of inhibitory activity of the falcipain-2 prodomain.
Dengue virus (DENV) is a mosquito-borne flavivirus that poses a threat to public health, yet no antiviral drug is available. We performed a high-throughput phenotypic screen using the Novartis compound library and identified candidate chemical inhibitors of DENV. This chemical series was optimized to improve properties such as anti-DENV potency and solubility. The lead compound, NITD-688, showed strong potency against all four serotypes of DENV and demonstrated excellent oral efficacy in infected AG129 mice. There was a 1.44-log reduction in viremia when mice were treated orally at 30 milligrams per kilogram twice daily for 3 days starting at the time of infection. NITD-688 treatment also resulted in a 1.16-log reduction in viremia when mice were treated 48 hours after infection. Selection of resistance mutations and binding studies with recombinant proteins indicated that the nonstructural protein 4B is the target of NITD-688. Pharmacokinetic studies in rats and dogs showed a long elimination half-life and good oral bioavailability. Extensive in vitro safety profiling along with exploratory rat and dog toxicology studies showed that NITD-688 was well tolerated after 7-day repeat dosing, demonstrating that NITD-688 may be a promising preclinical candidate for the treatment of dengue.
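For readers less used to log-scale viral load figures, the reductions quoted above can be converted to fold changes with a short Python sketch. It assumes the reported values are base-10 logarithms, which is the usual convention for viremia but is not stated explicitly in the abstract; the function names are illustrative.

```python
def fold_reduction(log10_reduction: float) -> float:
    """Convert a log10 reduction into a fold reduction (assumes base 10)."""
    return 10 ** log10_reduction


def percent_remaining(log10_reduction: float) -> float:
    """Percentage of the untreated viremia that remains after treatment."""
    return 100.0 / fold_reduction(log10_reduction)


for label, log_drop in [("treated at infection", 1.44), ("treated 48 h after infection", 1.16)]:
    print(f"{label}: {fold_reduction(log_drop):.1f}-fold lower, "
          f"{percent_remaining(log_drop):.1f}% remaining")
```

Under that assumption, a 1.44-log reduction corresponds to roughly a 28-fold drop in viremia and a 1.16-log reduction to roughly a 14-fold drop.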