The function of a protein is closely correlated with its subcellular location. With the rapid increase in new protein sequences entering data banks, we are confronted with a challenge: is it possible to use a bioinformatic approach to expedite the determination of protein subcellular locations? To explore this problem, proteins were classified, according to their subcellular locations, into the following 12 groups: (1) chloroplast, (2) cytoplasm, (3) cytoskeleton, (4) endoplasmic reticulum, (5) extracell, (6) Golgi apparatus, (7) lysosome, (8) mitochondria, (9) nucleus, (10) peroxisome, (11) plasma membrane and (12) vacuole. Based on this classification scheme, which covers almost all the organelles and subcellular compartments in an animal or plant cell, a covariant discriminant algorithm was proposed to predict the subcellular location of a query protein according to its amino acid composition. Results obtained through self-consistency, jackknife and independent dataset tests indicated that the rates of correct prediction by the current algorithm are significantly higher than those of the existing methods. It is anticipated that the classification scheme and concept, together with the prediction algorithm, can expedite the functionality determination of new proteins, and can also be of use in the prioritization of genes and proteins identified by genomic efforts as potential molecular targets for drug design.
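The amino acid composition used as the predictor input above can be illustrated with a minimal sketch. It assumes the usual definition (the fraction of each of the 20 standard residues in the sequence); the function name `amino_acid_composition`, the residue ordering in `AMINO_ACIDS`, and the choice to skip non-standard letters are illustrative assumptions, not details taken from the abstract.

```python
from collections import Counter

# The 20 standard amino acids in alphabetical one-letter order.
# The ordering is an arbitrary choice made for this sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list of residue fractions for a protein sequence.

    Letters outside the 20 standard residues (e.g. X, B, Z, gaps) are ignored,
    which is an assumption of this sketch.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    # Hypothetical toy sequence, used only to show the call.
    composition = amino_acid_composition("MKTAYIAKQR")
    for aa, fraction in zip(AMINO_ACIDS, composition):
        if fraction > 0:
            print(aa, round(fraction, 3))
```

Because the 20 fractions always sum to 1, only 19 of them are independent, which matters for any classifier that inverts a covariance matrix built from these vectors.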
Membrane proteins are classified according to two different schemes. In scheme 1, they are discriminated among the following five types: (1) type I single-pass transmembrane, (2) type II single-pass transmembrane, (3) multipass transmembrane, (4) lipid chain-anchored membrane, and (5) GPI-anchored membrane proteins. In scheme 2, they are discriminated among the following nine locations: (1) chloroplast, (2) endoplasmic reticulum, (3) Golgi apparatus, (4) lysosome, (5) mitochondria, (6) nucleus, (7) peroxisome, (8) plasma, and (9) vacuole. An algorithm is formulated for predicting the type or location of a given membrane protein based on its amino acid composition. The overall rates of correct prediction obtained by self-consistency and jackknife tests, as well as by an independent dataset test, were around 76-81% for the classification of five types, and 66-70% for the classification of nine cellular locations. Furthermore, classification and prediction were also conducted between inner and outer membrane proteins; the corresponding rates were 88-91%. These results imply that the types of membrane proteins, as well as their cellular locations and other attributes, are closely correlated with their amino acid composition. It is anticipated that the classification schemes and prediction algorithm can expedite the functionality determination of new proteins. The concept and method can also be useful in the prioritization of genes and proteins identified by genomics efforts as potential molecular targets for drug design. Proteins 1999;34:137-153. © 1999 Wiley-Liss, Inc.
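The jackknife test mentioned above is a leave-one-out procedure: each protein is held out in turn, the predictor is built from the remaining proteins, and the held-out protein is then classified. The sketch below shows that loop only. The classifier is a plain nearest-centroid rule used as a stand-in, not the algorithm from the abstract, and the names `jackknife_accuracy` and `nearest_centroid` are hypothetical.

```python
def nearest_centroid(x, training):
    """Stand-in classifier (NOT the paper's algorithm): pick the class whose
    mean vector is closest to x in squared Euclidean distance.

    training maps each class label to a list of feature vectors.
    """
    def centroid(rows):
        return [sum(col) / len(col) for col in zip(*rows)]

    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    return min(training, key=lambda label: squared_distance(x, centroid(training[label])))


def jackknife_accuracy(samples, classify):
    """Leave-one-out success rate.

    samples is a list of (feature_vector, label) pairs; classify is any
    function taking (vector, training_dict) and returning a label.
    """
    correct = 0
    for i, (x, label) in enumerate(samples):
        # Rebuild the training set without sample i.
        training = {}
        for j, (vector, other_label) in enumerate(samples):
            if j != i:
                training.setdefault(other_label, []).append(vector)
        if classify(x, training) == label:
            correct += 1
    return correct / len(samples)


if __name__ == "__main__":
    # Toy two-dimensional data invented for illustration only.
    toy = [
        ([0.10, 0.20], "type_a"), ([0.20, 0.10], "type_a"), ([0.15, 0.25], "type_a"),
        ([0.80, 0.90], "type_b"), ([0.90, 0.80], "type_b"), ([0.85, 0.95], "type_b"),
    ]
    print(jackknife_accuracy(toy, nearest_centroid))
```

A self-consistency test, by contrast, classifies each protein with a predictor built from the full dataset, including that protein, so it tends to give a more optimistic rate than the jackknife.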
G-protein-coupled receptors play a key role in cellular signaling networks that regulate various physiological processes, such as vision, smell, taste, neurotransmission, secretion, inflammatory and immune responses, cellular metabolism, and cellular growth. These proteins are very important for understanding human physiology and disease. Many efforts in pharmaceutical research have been aimed at understanding their structure and function. Unfortunately, because they are difficult to crystallize and most of them will not dissolve in normal solvents, very few G-protein-coupled receptor structures have been determined so far. In contrast, more than 1000 G-protein-coupled receptor sequences are known, and many more are expected to become known soon. Given this extreme imbalance between known structures and known sequences, it would be very useful to develop a fast sequence-based method to identify their different types. This would have practical value for both basic research and drug discovery because the function or binding specificity of a G-protein-coupled receptor is determined by the particular type it belongs to. To realize this, a statistical analysis has been performed for 566 G-protein-coupled receptors classified into seven different types. The results indicate that the types of G-protein-coupled receptors are predictable to a considerably accurate extent if a good training data set can be established for such a goal.
G-protein-coupled receptors have become a target in utilizing bioinformatics and genomics technology to facilitate drug discovery for psychiatric diseases. In this study the covariant-discriminant algorithm [Chou and Elrod (1999) Protein Eng., 12, 107-118] has been used to analyze the correlation between the types of G-protein-coupled receptors and the amino acid composition. It has been found that the types of G-protein-coupled receptors are quite closely correlated with the amino acid composition, implying that they are predictable to a considerably accurate extent if a good training data set can be established for that purpose. The method derived here can also be used for preliminary classification of orphan G-protein-coupled receptors. This will significantly expedite the process of identifying proper G-protein-coupled receptors for drug discovery.
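As a rough illustration of what a covariance-based discriminant over composition vectors can look like, the sketch below scores a query against each class by its squared Mahalanobis distance to the class mean plus the log-determinant of the class covariance, and predicts the class with the smallest score. This is an assumed general form, not a transcription of the cited algorithm. The small `ridge` term added to the covariance diagonal is also an assumption of this sketch, used to keep the matrix invertible; all function names are hypothetical.

```python
import math


def mean_vector(rows):
    return [sum(col) / len(col) for col in zip(*rows)]


def covariance(rows, ridge=1e-6):
    """Sample covariance matrix with a small ridge on the diagonal
    (the ridge is an assumption of this sketch, to avoid singularity)."""
    mu = mean_vector(rows)
    d = len(mu)
    cov = [[0.0] * d for _ in range(d)]
    for row in rows:
        diff = [row[i] - mu[i] for i in range(d)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j]
    for i in range(d):
        for j in range(d):
            cov[i][j] /= len(rows) - 1
        cov[i][i] += ridge
    return cov


def solve_and_logdet(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting.
    Returns (x, ln|det(matrix)|)."""
    n = len(rhs)
    a = [row[:] + [b] for row, b in zip(matrix, rhs)]
    logdet = 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            raise ValueError("singular covariance matrix")
        a[col], a[pivot] = a[pivot], a[col]
        logdet += math.log(abs(a[col][col]))
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        rest = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (a[i][n] - rest) / a[i][i]
    return x, logdet


def discriminant(x, rows):
    """Squared Mahalanobis distance from x to the class mean, plus ln|covariance|."""
    mu = mean_vector(rows)
    diff = [xi - mi for xi, mi in zip(x, mu)]
    solved, logdet = solve_and_logdet(covariance(rows), diff)
    return sum(d * s for d, s in zip(diff, solved)) + logdet


def predict(x, training):
    """training maps class label -> list of feature vectors (at least two per class)."""
    return min(training, key=lambda label: discriminant(x, training[label]))
```

In practice the vectors would be amino acid compositions; since their components sum to 1, a full 20-dimensional covariance is singular, so one component is typically dropped or, as here, the matrix is regularized.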