Antimicrobial peptides (AMPs) are promising candidates in the fight against multidrug-resistant pathogens owing to their broad range of activities and low toxicity. Nonetheless, identification of AMPs through wet-lab experiments is still expensive and time-consuming. Here, we propose an accurate computational method for AMP prediction based on the random forest algorithm. The prediction model is based on the distribution patterns of amino acid properties along the sequence. Using our collection of large and diverse sets of AMP and non-AMP data (3268 and 166791 sequences, respectively), we evaluated 19 random forest classifiers with different positive:negative data ratios by 10-fold cross-validation. Our optimal model, AmPEP with the 1:3 data ratio, showed high accuracy (96%), a Matthews correlation coefficient (MCC) of 0.9, an area under the receiver operating characteristic curve (AUC-ROC) of 0.99, and a Kappa statistic of 0.9. Descriptor analysis of AMP/non-AMP distributions by means of Pearson correlation coefficients revealed that reduced feature sets (from a full-featured set of 105 to a minimal-feature set of 23) can result in comparable performance in all respects except for some reductions in precision. Furthermore, AmPEP outperformed existing methods in terms of accuracy, MCC, and AUC-ROC when tested on benchmark datasets.
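The evaluation metrics named in this abstract (accuracy, MCC, and the Kappa statistic) can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard formulas, not the authors' evaluation code; the function names and the example counts (a hypothetical 1:3 positive:negative test set) are assumptions made for illustration only.

```python
import math


def accuracy(tp, fp, tn, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    num = tp * tn - fp * fn
    den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal is empty.
    return num / den if den else 0.0


def cohen_kappa(tp, fp, tn, fn):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    return (observed - expected) / (1 - expected) if expected != 1 else 0.0


# Hypothetical counts: 100 AMPs and 300 non-AMPs (1:3 ratio).
counts = dict(tp=90, fp=10, tn=290, fn=10)
print(accuracy(**counts), mcc(**counts), cohen_kappa(**counts))
```

Unlike accuracy, MCC and Kappa account for class imbalance, which is presumably why they are reported alongside accuracy when the positive:negative ratio is varied.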
Antimicrobial peptides (AMPs) are a valuable source of antimicrobial agents and a potential solution to the multi-drug resistance problem. In particular, short-length AMPs have been shown to have enhanced antimicrobial activities, higher stability, and lower toxicity to human cells. We present a short-length (≤30 aa) AMP prediction method, Deep-AmPEP30, developed based on an optimal feature set of PseKRAAC reduced amino acid composition and a convolutional neural network. On a balanced benchmark dataset of 188 samples, Deep-AmPEP30 yields an improved performance of 77% in accuracy, 85% in the area under the receiver operating characteristic curve (AUC-ROC), and 85% in the area under the precision-recall curve (AUC-PR) over existing machine learning-based methods. To demonstrate its power, we screened the genome sequence of Candida glabrata, a gut commensal fungus expected to interact with and/or inhibit other microbes in the gut, for potential AMPs and identified a peptide of 20 aa (P3, FWELWKFLKSLWSIFPRRRP) with strong antibacterial activity against Bacillus subtilis and Vibrio parahaemolyticus. The potency of the peptide is remarkably comparable to that of ampicillin. Therefore, Deep-AmPEP30 is a promising prediction tool to identify short-length AMPs from genomic sequences for drug discovery. Our method is available at https://cbbio.cis.um.edu.mo/AxPEP for both individual sequence prediction and genome screening for AMPs.
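To illustrate the general idea of a reduced amino acid composition feature, the sketch below maps each residue to a coarse group and reports the fraction of the sequence in each group. The six-group alphabet here is a hypothetical grouping chosen for illustration; it is not one of the PseKRAAC cluster schemes, and the abstract does not state which scheme Deep-AmPEP30 uses. The function and variable names are likewise assumptions.

```python
# Hypothetical reduced alphabet (NOT a PseKRAAC cluster definition).
REDUCED_GROUPS = {
    "aliphatic": "AVLIMC",
    "aromatic": "FWYH",
    "polar": "STNQ",
    "positive": "KR",
    "negative": "DE",
    "special": "GP",
}

# Invert the grouping into a residue -> group lookup table.
RESIDUE_TO_GROUP = {
    residue: group
    for group, residues in REDUCED_GROUPS.items()
    for residue in residues
}


def reduced_composition(sequence):
    """Return the fraction of residues falling into each reduced group."""
    counts = {group: 0 for group in REDUCED_GROUPS}
    for residue in sequence.upper():
        group = RESIDUE_TO_GROUP.get(residue)
        if group is None:
            raise ValueError(f"Unknown residue: {residue!r}")
        counts[group] += 1
    total = len(sequence)
    return {group: count / total for group, count in counts.items()}


# The 20-aa peptide P3 reported in the abstract.
p3 = "FWELWKFLKSLWSIFPRRRP"
for group, fraction in reduced_composition(p3).items():
    print(f"{group:10s} {fraction:.2f}")
```

A fixed-length vector of this kind can be fed to a classifier regardless of peptide length, which is the practical appeal of composition-style features for short sequences.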
In human cells, one-third of all polypeptides enter the secretory pathway at the endoplasmic reticulum (ER). The specificity and efficiency of this process are guaranteed by targeting of mRNAs and/or polypeptides to the ER membrane. Cytosolic SRP and its receptor in the ER membrane facilitate the cotranslational targeting of most ribosome-nascent precursor polypeptide chain (RNC) complexes together with the respective mRNAs to the Sec61 complex in the ER membrane. Alternatively, fully synthesized precursor polypeptides are targeted to the ER membrane post-translationally by either the TRC, SND, or PEX19/3 pathway. Furthermore, there is targeting of mRNAs to the ER membrane, which does not involve SRP but involves mRNA- or RNC-binding proteins on the ER surface, such as RRBP1 or KTN1. Traditionally, the targeting reactions were studied in cell-free or cellular assays, which focus on a single precursor polypeptide and show whether a certain precursor can use a certain pathway. Recently, cellular approaches such as proximity-based ribosome profiling or quantitative proteomics were employed to address the question of which precursors use certain pathways under physiological conditions. Here, we combined siRNA-mediated depletion of putative mRNA receptors in HeLa cells with label-free quantitative proteomics and differential protein abundance analysis to characterize RRBP1- or KTN1-involving precursors and to identify possible genetic interactions between the various targeting pathways. Furthermore, we discuss the possible implications for the so-called TIGER domains and critically discuss the pros and cons of this experimental approach.
The Mycobacterium ulcerans exotoxin, mycolactone, is an inhibitor of co-translational translocation via the Sec61 complex. Mycolactone has previously been shown to bind to, and alter the structure of, the major translocon subunit Sec61α, and change its interaction with ribosome nascent chain complexes. In addition to its function in protein translocation into the ER, Sec61 also plays a key role in cellular Ca2+ homeostasis, acting as a leak channel between the endoplasmic reticulum (ER) and cytosol. Here, we have analysed the effect of mycolactone on cytosolic and ER Ca2+ levels using compartment-specific sensors. We also used molecular docking analysis to explore potential interaction sites for mycolactone on translocons in various states. These results show that mycolactone enhances the leak of Ca2+ ions via the Sec61 translocon, resulting in a slow but substantial depletion of ER Ca2+. This leak was dependent on mycolactone binding to Sec61α because resistance mutations in this protein completely ablated the increase. Molecular docking supports the existence of a mycolactone-binding transient inhibited state preceding translocation and suggests mycolactone may also bind Sec61α in its idle state. We propose that delayed ribosomal release after translation termination and/or translocon ‘breathing’ during rapid transitions between the idle and intermediate-inhibited states allow for transient Ca2+ leak, and mycolactone’s stabilisation of the latter underpins the phenotype observed.
Inference of the molecular function of proteins is a fundamental task in the quest for understanding cellular processes. The task is getting increasingly difficult with thousands of new proteins discovered each day. The difficulty arises primarily from the lack of high-throughput experimental techniques for assessing protein molecular function, a lacuna that computational approaches are trying hard to fill. The latter, too, face a major bottleneck in the absence of clear evidence based on evolutionary information. Here we propose a de novo approach to annotate protein molecular function through a structural dynamics match for a pair of segments from two dissimilar proteins, which may share even <10% sequence identity. To screen these matches, corresponding 1 µs coarse-grained (CG) molecular dynamics trajectories were used to compute normalized root-mean-square-fluctuation graphs and select mobile segments, which were thereafter matched for all pairs using unweighted three-dimensional autocorrelation vectors. Our in-house custom-built force field (FF), extensively validated against dynamics information obtained from experimental nuclear magnetic resonance data, was specifically used to generate the CG dynamics trajectories. The test for correspondence between the dynamics signature of protein segments and function revealed an 87% true positive rate and a 93.5% true negative rate on a dataset of 60 experimentally validated proteins, including moonlighting proteins and those with novel functional motifs. A random test against 315 unique fold/function proteins for a negative test gave >99% true recall. A blind prediction on a novel protein appears consistent with additional evidence retrieved therein. This is the first proof-of-principle of generalized use of structural dynamics for inferring protein molecular function leveraging our custom-made CG FF, useful to all.
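The step of computing a normalized root-mean-square-fluctuation (RMSF) profile and selecting mobile segments can be sketched as follows. This is an illustrative outline under stated assumptions, not the authors' pipeline: the abstract does not specify the normalization or the selection rule, so dividing by the maximum RMSF, the 0.5 threshold, and the minimum segment length of 2 are all hypothetical choices, as are the function names. The trajectory is assumed to be already aligned to a reference structure.

```python
import math


def rmsf(traj):
    """Per-residue RMSF; traj[t][i] is the (x, y, z) of residue i in frame t."""
    n_frames = len(traj)
    n_res = len(traj[0])
    values = []
    for i in range(n_res):
        # Time-averaged position of residue i.
        mean = [sum(frame[i][k] for frame in traj) / n_frames for k in range(3)]
        # Mean squared deviation from that average position.
        msd = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3)) for frame in traj
        ) / n_frames
        values.append(math.sqrt(msd))
    return values


def normalize(values):
    """Scale to [0, 1] by the maximum value (an assumed normalization)."""
    top = max(values)
    return [v / top if top > 0 else 0.0 for v in values]


def mobile_segments(norm_rmsf, threshold=0.5, min_len=2):
    """Return (start, end) index pairs of contiguous runs at or above threshold."""
    segments, start = [], None
    for i, v in enumerate(norm_rmsf):
        if v >= threshold and start is None:
            start = i
        elif v < threshold and start is not None:
            if i - start >= min_len:
                segments.append((start, i - 1))
            start = None
    if start is not None and len(norm_rmsf) - start >= min_len:
        segments.append((start, len(norm_rmsf) - 1))
    return segments


# Toy two-frame trajectory with four residues; residues 1 and 2 move along x.
toy = [
    [(0, 0, 0), (1, 0, 0), (2, 0, 0), (5, 5, 5)],
    [(0, 0, 0), (-1, 0, 0), (-2, 0, 0), (5, 5, 5)],
]
profile = normalize(rmsf(toy))
print(profile, mobile_segments(profile))
```

In a real analysis the coordinates would come from a trajectory reader rather than a literal list, and the selected segments would then be passed to whatever segment-matching descriptor is used.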
The Sec complex catalyzes the translocation of proteins of the secretory pathway into the endoplasmic reticulum and the integration of membrane proteins into the endoplasmic reticulum membrane. Some substrate peptides require the presence and involvement of accessory proteins such as Sec63. Recently, a structure of the Sec complex from Saccharomyces cerevisiae, consisting of the Sec61 channel and the Sec62, Sec63, Sec71 and Sec72 proteins, was determined by cryo-electron microscopy (cryo-EM). Here, we show by co-precipitation that the accessory membrane protein Sec62 is not required for formation of stable Sec63-Sec61 contacts. Molecular dynamics simulations started from the cryo-EM conformation of Sec61 bound to Sec63 and of unbound Sec61 revealed how Sec63 affects the conformation of the Sec61 lateral gate, plug, pore region and pore ring diameter via three intermolecular contact regions. Molecular docking of SRP-dependent vs. SRP-independent peptide chains into the Sec61 channel showed that the pore regions affected by the presence or absence of Sec63 play a crucial role in positioning the signal anchors of SRP-dependent substrates near the lateral gate.
Reasonable all-atom or united-atom biomolecular force fields have been developed to represent the properties of proteins and lipid membranes in molecular dynamics simulations. However, since they have not been parametrized for self-assembled monolayers (SAMs), their utility in simulating SAMs and protein−SAM systems has not been confirmed. Here, we compared six popular biomolecular force fields, Lipid14, GAFF, L-OPLS, CHARMM36, Slipids, and GROMOS54a7, to simulate alkanethiol SAMs of short to long chains (C10−C18). Our results show that none of these force fields reproduce the chain length dependence of the tilt angle and twist angle. Although the droplet contact angles on SAMs are well represented by all force fields, only GAFF and Lipid14 yield phase transition temperatures that are reasonably close to the experimental values. Overall, our comprehensive comparison suggests that GAFF and Lipid14 are better choices for SAM simulations; further improvements in the force field parameters for SAMs are required.
The human Sec61 complex is a widely distributed and abundant molecular machine. It resides in the membrane of the endoplasmic reticulum to channel two types of cargo: protein substrates and calcium ions. The SEC61A1 gene encodes the pore-forming Sec61α subunit of the Sec61 complex. Despite their ubiquitous expression, the idiopathic SEC61A1 missense mutations p.V67G and p.T185A trigger a localized disease pattern diagnosed as autosomal dominant tubulointerstitial kidney disease (ADTKD–SEC61A1). Using cellular disease models for ADTKD–SEC61A1, we identified an impaired protein transport of the renal secretory protein renin and a reduced abundance of regulatory calcium transporters, including SERCA2. Treatment with the molecular chaperone phenylbutyrate reversed the defective protein transport of renin and the imbalanced calcium homeostasis. Signal peptide substitution experiments pointed at targeting sequences as the cause for the substrate-specific impairment of protein transport in the presence of the V67G or T185A mutations. Similarly, dominant mutations in the signal peptide of renin also cause ADTKD and point to impaired transport of this renal hormone as an important pathogenic feature for ADTKD–SEC61A1 patients as well.