The number of protein and peptide structures deposited in the Protein Data Bank (PDB) and GenBank without functional annotation has increased. Consequently, there is a high demand for theoretical models to predict these functions. Here, we trained and validated, with an external set, a Markov Chain Model (MCM) that classifies proteins by their possible mechanism of action according to Enzyme Commission (EC) number. The methodology proposed is essentially new, and enables prediction of all EC classes with a single equation, without the need for one equation per class or for nonlinear models with multiple outputs. In addition, the model may be used to predict whether a given peptide makes a positive or negative contribution to the activity of the same EC class. The model correctly predicts the first EC number for 106 out of 151 (70.2%) oxidoreductases, 178/178 (100%) transferases, 223/223 (100%) hydrolases, 64/85 (75.3%) lyases, 74/74 (100%) isomerases, and 100/100 (100%) ligases, as well as 745/811 (91.9%) nonenzymes. It is important to underline that this method may help us predict new enzyme proteins or select peptide candidates that improve enzyme activity, which may be of interest for the prediction of new drugs or drug targets. To illustrate the model's application, we report the 2D-Electrophoresis (2DE) isolation of a new protein sequence from Leishmania infantum, as well as its MALDI-TOF mass spectra characterization and a theoretical study of its Peptide Mass Fingerprints (PMFs). The theoretical study focused on MASCOT, BLAST alignment, and alignment-free QSAR prediction of the contribution of 29 peptides found in the PMF of the new protein to specific enzyme action. This combined strategy may be used to identify and predict peptides of prokaryote and eukaryote parasites and their hosts, as well as of other higher organisms, which may be of interest in drug development or target identification.
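The per-class figures quoted above can be re-derived with simple arithmetic. The short Python sketch below recomputes each percentage from the counts given in the abstract; the dictionary layout and variable names are illustrative assumptions, not part of the original work.

```python
# Per-class counts (correctly predicted, total) copied from the abstract above.
counts = {
    "oxidoreductases": (106, 151),
    "transferases": (178, 178),
    "hydrolases": (223, 223),
    "lyases": (64, 85),
    "isomerases": (74, 74),
    "ligases": (100, 100),
}


def percent(correct, total):
    """Percentage of correct predictions, rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


# Percentage of correctly classified proteins in each enzyme class.
per_class = {name: percent(c, t) for name, (c, t) in counts.items()}

# Pool the six enzyme classes into a single count.
enzyme_correct = sum(c for c, _ in counts.values())
enzyme_total = sum(t for _, t in counts.values())
pooled = percent(enzyme_correct, enzyme_total)

for name, value in per_class.items():
    print(f"{name}: {value}%")
print(f"pooled enzyme classes: {enzyme_correct}/{enzyme_total} = {pooled}%")
```

Pooled over the six enzyme classes, the counts come to 745 of 811 (91.9%), which is the same as the last figure quoted in the abstract.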
In this communication we carry out an in-depth review of a very versatile QSPR-like method. The method, named MARCH-INSIDE (MARkov CHains Invariants for Network Selection and DEsign), is a simple but efficient computational approach to the study of QSPR-like problems in biomedical sciences. The method uses the theory of Markov Chains to generate parameters that numerically describe the structure of a system. This approach generates two principal types of parameters, including Stochastic Topological Indices (sto-TIs). The use of these parameters allows the rapid collection, annotation, retrieval, comparison and mining of structures of molecular, macromolecular, supramolecular, and non-molecular systems within large databases. Here, we review and comment for the first time on several applications of MARCH-INSIDE to the prediction of drug ADMET, activity, metabolizing enzymes, and toxico-proteomics biomarker discovery. The MARCH-INSIDE models reviewed are: a) drug-tissue distribution profiles, b) assembling drug-tissue complex networks, c) multi-target models for anti-parasite/anti-microbial activity, d) assembling drug-target networks, e) drug toxicity and side effects, f) web-server for drug metabolizing enzymes, g) models in drug toxico-proteomics. We close the review with some legal remarks related to the use of this class of QSPR-like models.
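As a rough illustration of how Markov chain theory can turn a structure into numerical parameters, the sketch below row-normalizes a graph's adjacency matrix into a stochastic matrix and tracks the Shannon entropy of a random walk over a few steps. It assumes a plain unweighted random walk; it is not the MARCH-INSIDE weighting scheme, which the abstract does not specify. The function names and the choice of entropy as the descriptor are illustrative assumptions.

```python
import math


def transition_matrix(adjacency):
    """Row-normalize an adjacency matrix into a stochastic (Markov) matrix."""
    matrix = []
    for row in adjacency:
        degree = sum(row)
        if degree == 0:
            raise ValueError("isolated node: row cannot be normalized")
        matrix.append([value / degree for value in row])
    return matrix


def step(distribution, matrix):
    """One Markov step: new[j] = sum over i of distribution[i] * matrix[i][j]."""
    size = len(matrix)
    return [
        sum(distribution[i] * matrix[i][j] for i in range(size))
        for j in range(size)
    ]


def shannon_entropy(distribution):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in distribution if p > 0)


def entropy_descriptors(adjacency, max_steps):
    """Entropy of the walk's node distribution at steps 0..max_steps.

    The walk starts from a uniform distribution over nodes (an assumption).
    """
    matrix = transition_matrix(adjacency)
    size = len(adjacency)
    distribution = [1.0 / size] * size
    descriptors = []
    for _ in range(max_steps + 1):
        descriptors.append(shannon_entropy(distribution))
        distribution = step(distribution, matrix)
    return descriptors


# Hypothetical example: a 4-node path graph 0-1-2-3.
path_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
P = transition_matrix(path_graph)
descriptors = entropy_descriptors(path_graph, max_steps=3)
print(descriptors)
```

Each entry of `descriptors` is one number per step of the chain, so a whole structure is reduced to a short numeric vector that could feed a QSPR-style model.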
Quantitative Structure-Activity Relationship (QSAR) models have been used in Pharmaceutical Design and Medicinal Chemistry for the discovery of anti-parasite drugs. QSAR models predict biological activity using different types of structural parameters of molecules as input. Topological Indices (TIs) are a very interesting class of these parameters. We can derive TIs from graph representations based only on nodes (atoms) and edges (chemical bonds). TIs are computationally inexpensive because they depend only on atom-atom connectivity information. This information, expressed in molecular graphs, can be tabulated in the form of adjacency matrices that are easy to manipulate with computers. Consequently, TIs allow the rapid collection, annotation, retrieval, comparison and mining of molecular structures within large databases. The interest in TIs has exploded because we can also use them to describe macromolecular and macroscopic systems represented by complex networks of interactions (links) between the different parts of a system (nodes), such as drug-target, protein-protein, metabolic, host-parasite, brain cortex, parasite disease spreading, Internet, or social networks. In this work, we review and comment on the following topics related to the use of TIs in anti-parasite drug and target discovery. The first topic is Topological Indices and QSAR for antiparasitic drugs, which includes the theoretical background, QSAR for anti-malaria drugs, and QSAR for anti-Toxoplasma drugs. The second topic is the TOMO-COMD approach to QSAR of antiparasitic drugs, which includes the TOMO-COMD theoretical background and TOMO-COMD models for antihelmintic activity, Trichomonas, anti-malarials, and anti-trypanosome compounds. The third section discusses Topological Indices in the context of Complex Networks. The last section is devoted to the MARCH-INSIDE approach to QSAR of antiparasitic drugs and targets. It begins with a theoretical background for drugs and parameters for proteins. Next, we review MARCH-INSIDE models for Pharmaceutical Design of antiparasitic drugs, including flukicidal and anti-coccidial drugs. We then cover multi-target QSAR of antiparasitic drugs and the MARCH-INSIDE assembly of complex networks of antiparasitic drugs. We close the MARCH-INSIDE section by discussing the prediction of proteins in parasites and the MARCH-INSIDE web-servers for Protein-Protein interactions in parasites: Plasmod-PPI and Trypano-PPI. We close this review with a section on some legal issues related to QSAR models.
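The point above that TIs depend only on atom-atom connectivity can be made concrete with two classical indices computed directly from an adjacency matrix. The sketch below uses the Wiener index (sum of shortest-path distances over all atom pairs) and the Randić connectivity index (sum over bonds of 1/sqrt(d_i * d_j), where d is the atom degree). These are chosen as familiar examples and are not necessarily the indices used in the reviewed models; the hydrogen-suppressed carbon skeletons and function names are illustrative assumptions.

```python
import math
from collections import deque


def distances_from(adjacency, source):
    """Breadth-first search distances (in bonds) from one atom."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor, bonded in enumerate(adjacency[node]):
            if bonded and neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def wiener_index(adjacency):
    """Sum of shortest-path distances over unordered atom pairs.

    Assumes a connected graph; unreachable pairs would be silently skipped.
    """
    total = 0
    for source in range(len(adjacency)):
        dist = distances_from(adjacency, source)
        total += sum(d for node, d in dist.items() if node > source)
    return total


def randic_index(adjacency):
    """Sum over bonds (i, j) of 1 / sqrt(degree_i * degree_j)."""
    degree = [sum(row) for row in adjacency]
    n = len(adjacency)
    return sum(
        1.0 / math.sqrt(degree[i] * degree[j])
        for i in range(n)
        for j in range(i + 1, n)
        if adjacency[i][j]
    )


# Hydrogen-suppressed carbon skeletons of two C4 isomers.
n_butane = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
isobutane = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]

for name, graph in (("n-butane", n_butane), ("isobutane", isobutane)):
    print(name, wiener_index(graph), round(randic_index(graph), 4))
```

Both isomers have the same atoms and the same number of bonds, yet the indices differ, which is what lets TIs discriminate structures from connectivity alone.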
Perturbation methods add variation terms to a known experimental solution of one problem in order to approach a solution for a related problem with no known exact solution. One problem of this type in immunology is the prediction of the possible epitope activity of a peptide after a perturbation or variation in the structure of a known peptide and/or in other boundary conditions (host organism, biological process, and experimental assay). However, to the best of our knowledge, there are no reports of general-purpose perturbation models to solve this problem. In a recent work, we introduced a new quantitative structure-property relationship theory for the study of perturbations in complex biomolecular systems. In this work, we developed the first model able to classify more than 200,000 cases of perturbations with accuracy, sensitivity, and specificity >90% in both training and validation series. The perturbations include structural changes in >50,000 peptides determined in experimental assays with boundary conditions involving >500 source organisms, >50 host organisms, >10 biological processes, and >30 experimental techniques. The model may be useful for the prediction of new epitopes or the optimization of known peptides towards computational vaccine design.
Abstract
Several graph representations have been introduced for different data in theoretical biology. For instance, Complex Networks based on Graph theory are used to represent the structure and/or dynamics of different large biological systems such as protein-protein interaction networks. In addition, Randic, Liao, Nandy, Basak, and many others developed some special types of graph-based representations. This special type of graph includes geometrical constraints on node positioning in space and adopts final geometrical shapes that resemble lattice-like patterns. Lattice networks have been used to visually depict DNA and protein sequences, but they are very flexible. However, despite the proven efficacy of new lattice-like graphs/networks to represent diverse systems, most works focus on only one specific type of biological data. This work proposes a generalized type of lattice and illustrates how to use it to represent and compare biological data from different sources. We exemplify the following cases: Protein sequence; Mass Spectra (MS) of protein Peptide Mass Fingerprints (PMF); Molecular Dynamic Trajectories (MDTs) from structural studies; mRNA Microarray data; Single Nucleotide Polymorphisms (SNPs); 1D or 2D-Electrophoresis study of protein Polymorphisms; and Protein-research patent and/or copyright information. We used data available from public sources for some examples, but for others we used experimental results reported herein for the first time. This work may break new ground for the application of graph theory in theoretical biology and other areas of biomedical sciences.
Keywords: Graph theory; Complex Networks; Proteomics; Mass Spectrometry; Leishmaniosis; 2D Electrophoresis; Parasite population Polymorphism; Single Nucleotide Polymorphism; Schizophrenia; Microarray; Cancer; Patents & Copyright studies.
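To make the idea of a lattice-like graph concrete, the following minimal sketch maps a DNA sequence onto a 2D grid walk: each base moves one unit in a fixed direction, visited grid points become nodes, and consecutive points are joined by edges. The base-to-direction assignment and the function names are assumptions chosen for illustration; this is not the generalized lattice proposed in the abstract, whose construction is not described here.

```python
# Assumed base-to-direction mapping (illustrative only).
STEPS = {"A": (-1, 0), "G": (1, 0), "C": (0, 1), "T": (0, -1)}


def lattice_walk(sequence, steps=STEPS):
    """Return the list of grid points visited, starting at the origin."""
    x, y = 0, 0
    path = [(x, y)]
    for symbol in sequence.upper():
        dx, dy = steps[symbol]  # raises KeyError for symbols outside the alphabet
        x, y = x + dx, y + dy
        path.append((x, y))
    return path


def lattice_graph(path):
    """Collapse a walk into nodes (with visit counts) and undirected edges."""
    nodes = {}
    for point in path:
        nodes[point] = nodes.get(point, 0) + 1
    edges = set()
    for a, b in zip(path, path[1:]):
        edges.add((min(a, b), max(a, b)))  # order endpoints so (a, b) == (b, a)
    return nodes, edges


# Hypothetical toy sequence.
walk = lattice_walk("GATTACA")
nodes, edges = lattice_graph(walk)
print(walk[-1], len(nodes), len(edges))
```

Because every node carries spatial coordinates, two sequences can be compared through their node sets, visit counts, or end points, which is the kind of geometric constraint that ordinary complex networks lack.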
Introduction
Several graph representations have been introduced for different data in theoretical biology. For instance, Complex Networks based on Graph theory are used to represent the structure and/or dynamics of different large biological systems such as protein-protein interaction networks. Complex networks are made up of nodes and edges/arcs (node-node connections or links). Drugs, genes, RNAs, proteins, organisms, brain cortex regions, diseases, patients or environmental systems may play the role of nodes. In general, the edges represent similarity/dissimilarity relationships between the nodes. In Complex Networks, both nodes and edges are generally placed in space without any geometrical constraints; nodes do not need spatial coordinates and edges do not have a specific length or shape (Barabasi and Oltvai, 2004; Boccaletti et al., 2006; Estrada, 2006). In addition, Randic, Nandy, Basak, Liao, and many others developed some special types of graph-based representations. This special type of graph includes geometrical constraints on node positioning in space and sometimes adopts final geometrical shapes that resemble lattice...