Protein phosphorylation is an essential post-translational modification that plays a vital role in the regulation of many fundamental cellular processes. We propose a LightGBM‐based computational approach that uses evolutionary, geometric, sequence environment, and amino acid‐specific features to identify phosphate binding sites from a protein sequence. Compared with other existing methods on 2429 protein sequences taken from the standard Phospho.ELM (P.ELM) benchmark data set featuring 11 organisms, our method reports a higher F1 score of 0.504 (harmonic mean of precision and recall) and ROC AUC of 0.836 (area under the receiver operating characteristic curve). The computation time of our proposed approach is much less than that of the recently developed deep learning‐based framework. Structural analysis of selected protein sequences indicates that our predictions form a superset of the phosphorylation sites listed in the P.ELM data set. The foundation of our scheme is manual feature engineering and decision tree‐based classification. Hence, it is intuitive, and one can interpret the final tree as a set of rules, resulting in a deeper understanding of the relationships between biophysical features and phosphorylation sites. Our innovative problem transformation method permits more control over precision and recall, as demonstrated by the fact that if we incorporate the output probability of the existing deep learning framework as an additional feature, our prediction improves (F1 score = 0.546; ROC AUC = 0.849). The implementation of our method can be accessed at http://cse.iitkgp.ac.in/~pralay/resources/PPSBoost/ and is mirrored at https://cosmos.iitkgp.ac.in/PPSBoost.
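The two metrics quoted above can be illustrated with a minimal sketch. The labels and probabilities below are made-up toy values, not data from P.ELM, and the function names `precision_recall_f1` and `roc_auc` are hypothetical helpers rather than part of the PPSBoost implementation. The sketch only shows how F1 follows from precision and recall at a chosen probability threshold, and how ROC AUC can be computed as the fraction of positive/negative pairs ranked correctly.

```python
def precision_recall_f1(y_true, y_prob, threshold=0.5):
    """Threshold the probabilities, then compute precision, recall and F1."""
    tp = fp = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 1:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def roc_auc(y_true, y_prob):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Assumes both classes are present.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Toy example (assumed values): 1 = phosphorylation site, 0 = non-site.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.8, 0.6, 0.4, 0.3, 0.2, 0.1, 0.05]

default_metrics = precision_recall_f1(y_true, y_prob, threshold=0.5)
lenient_metrics = precision_recall_f1(y_true, y_prob, threshold=0.15)
auc = roc_auc(y_true, y_prob)
```

Lowering the threshold in this toy example raises recall at the cost of precision, which is the kind of trade-off the abstract refers to when it mentions control over precision and recall.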
The first step in predicting the outcome of a chemical reaction is to classify existing chemical reactions, on the basis of which the possible outcome of an unknown reaction can be predicted. There are two approaches to the classification of chemical reactions: model-driven and data-driven. In the model-driven approach, chemical structures are usually stored in a computer as molecular graphs. Such graphs can also be represented as matrices. The preferred matrix representation for storing a molecular graph is the Bond-Electron matrix (BE-matrix). Ugi and his co-workers showed that the Reaction matrix (R-matrix) of a chemical reaction can be obtained from the BE-matrices of the educts and products. Ugi's scheme comprises 30 reaction classes according to which reactions can be classified, but despite these classes, several reactions could not be classified. About 4000 reactions from The Chemical Thesaurus (a chemical reaction database) were studied in this work, and 24 new classes emerged, which led to an extension of Ugi's scheme. An efficient algorithm based on the extended Ugi's scheme has been developed for the classification of chemical reactions. Because reaction matrices are symmetric, a matrix implementation of the extended Ugi's scheme using a conventional upper/lower triangular matrix has O(n²) space complexity. The time complexity of a similar matrix implementation is O(n²) in the worst case. The authors' proposed algorithm uses two fixed-size look-up tables in a novel way and requires constant space. Although the worst-case time complexity of their algorithm is still O(n²), it outperforms the conventional matrix implementation when the number of atoms or components in the chemical reaction is 4 or more.
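The relationship between BE-matrices and the R-matrix can be sketched as follows, assuming the standard Dugundji-Ugi convention in which off-diagonal entries hold bond orders, diagonal entries hold free valence electrons, and the R-matrix is the product BE-matrix minus the educt BE-matrix. The example reaction (H2 + Cl2 → 2 HCl), the atom ordering, and the helper names are illustrative assumptions. This is the conventional matrix representation, not the authors' look-up-table algorithm.

```python
def reaction_matrix(be_educts, be_products):
    """R = E - B: element-wise difference of product and educt BE-matrices."""
    n = len(be_educts)
    return [[be_products[i][j] - be_educts[i][j] for j in range(n)]
            for i in range(n)]


def is_symmetric(matrix):
    """Check that matrix[i][j] == matrix[j][i] for all i, j."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i]
               for i in range(n) for j in range(i))


def upper_triangle(matrix):
    """Pack the upper triangle (diagonal included) into a flat list.

    This stores n(n+1)/2 entries, which is still O(n^2) space.
    """
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i, n)]


# Assumed atom order: H1, H2, Cl1, Cl2.
# Educts: H1-H2 and Cl1-Cl2 single bonds; each Cl has 6 free electrons.
B = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 6, 1],
    [0, 0, 1, 6],
]
# Products: H1-Cl1 and H2-Cl2 single bonds.
E = [
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 6, 0],
    [0, 1, 0, 6],
]

R = reaction_matrix(B, E)
packed = upper_triangle(R)
```

In `R`, negative off-diagonal entries mark bonds that are broken and positive entries mark bonds that are formed. The entries sum to zero because no electrons are gained or lost in the reaction.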
Flagellar rotation regulates the phenomenon of chemotaxis in bacteria. The interaction between the stator unit and the rotor unit of the flagellar motor is responsible for switching the direction of bacterial flagellar rotation. However, the molecular mechanism by which the stator (MotA/MotB) and the rotor (FliG/FliM/FliN) proteins interact to switch the rotational direction has not been clear. To address this, the asymmetry in the copy numbers of FliG, FliM, and FliN molecules was resolved by reconstructing the switch complex using a modeled rotor unit that fulfills the experimentally available geometric constraints. The diameter of our assembled switch complex agrees with the existing literature. Experimental evidence and the conformational spread model validate our constructed switch complex. Subsequently, normal mode analysis (NMA) of the constructed protomer units revealed that the most fluctuating molecule in the rotor unit is FliG, which interacts with the bacterial stator through its C-terminal domain. NMA also aids our understanding of how FliG reorients between the two states of flagellar rotation, i.e., counter-clockwise to clockwise and vice versa. Our observations regarding speed regulation, the gap between rotor and stator, and flagellar switching due to the activity of cytoplasmic proteins indicate that the bacterial flagellar motor uses the same mechanism as an electric motor. Graphical abstract: Molecular mechanism of the bacterial flagellar switch.