Xuan Xiao scite author profile

Predicting protein subcellular localization is an important and difficult problem, particularly when query proteins may have the multiplex character, i.e., simultaneously exist at, or move between, two or more different subcellular location sites. Most of the existing protein subcellular location predictors can only deal with single-location or “singleplex” proteins. Yet multiple-location or “multiplex” proteins should not be ignored, because they usually possess unique biological functions worthy of special notice. By introducing “multi-labeled learning” and the “accumulation-layer scale”, a new predictor, called iLoc-Euk, has been developed that can deal with systems containing both singleplex and multiplex proteins. As a demonstration, the jackknife cross-validation was performed with iLoc-Euk on a benchmark dataset of eukaryotic proteins classified into the following 22 location sites: (1) acrosome, (2) cell membrane, (3) cell wall, (4) centriole, (5) chloroplast, (6) cyanelle, (7) cytoplasm, (8) cytoskeleton, (9) endoplasmic reticulum, (10) endosome, (11) extracellular, (12) Golgi apparatus, (13) hydrogenosome, (14) lysosome, (15) melanosome, (16) microsome, (17) mitochondrion, (18) nucleus, (19) peroxisome, (20) spindle pole body, (21) synapse, and (22) vacuole, where none of the proteins included has pairwise sequence identity to any other in the same subset. The overall success rate thus obtained by iLoc-Euk was 79%, which is significantly higher than that of any existing predictor that can also deal with such a complicated and stringent system. As a user-friendly web-server, iLoc-Euk is freely accessible to the public at http://icpr.jci.edu.cn/bioinfo/iLoc-Euk. It is anticipated that iLoc-Euk may become a useful bioinformatics tool for Molecular Cell Biology, Proteomics, System Biology, and Drug Development. Also, its novel approach may further stimulate the development of methods for predicting other protein attributes.

iDNA-Prot: Identification of DNA Binding Proteins Using Random Forest with Grey Model

Wei

et al. 2011

DNA-binding proteins play crucial roles in various cellular processes. Developing high throughput tools for rapidly and effectively identifying DNA-binding proteins is one of the major challenges in the field of genome annotation. Although many efforts have been made in this regard, further effort is needed to enhance the prediction power. By incorporating features extracted from protein sequences via the “grey model” into the general form of pseudo amino acid composition, and by adopting the random forest operation engine, we proposed a new predictor, called iDNA-Prot, for identifying uncharacterized proteins as DNA-binding or non-DNA-binding proteins based on their amino acid sequence information alone. The overall success rate of iDNA-Prot was 83.96%, obtained via jackknife tests on a newly constructed stringent benchmark dataset in which none of the proteins included has pairwise sequence identity to any other in the same subset. In addition to achieving a high success rate, iDNA-Prot requires remarkably less computational time than the relevant existing predictors. Hence it is anticipated that iDNA-Prot may become a useful high throughput tool for large-scale analysis of DNA-binding proteins. As a user-friendly web-server, iDNA-Prot is freely accessible to the public at http://icpr.jci.edu.cn/bioinfo/iDNA-Prot or http://www.jci-bioinfo.cn/iDNA-Prot. Moreover, for the convenience of the vast majority of experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results.
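The jackknife test mentioned in this abstract leaves each protein out in turn, predicts it from all the others, and reports the fraction predicted correctly. The following is a minimal sketch of that procedure only. It is not the iDNA-Prot method: plain amino acid composition and a nearest-neighbor rule stand in for the grey-model features and the random forest, and the sequences, labels, and function names are made up for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(seq):
    """Return the 20-dimensional amino acid composition of a sequence."""
    counts = Counter(seq)
    total = sum(counts[a] for a in AMINO_ACIDS) or 1  # avoid division by zero
    return [counts[a] / total for a in AMINO_ACIDS]


def nearest_neighbor_label(query, training):
    """Return the label of the training sample closest to the query vector."""
    def squared_distance(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(training, key=lambda item: squared_distance(query, item[0]))[1]


def jackknife_success_rate(samples):
    """Leave each sample out once, predict it from the rest, return fraction correct."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]  # every sample except the i-th
        if nearest_neighbor_label(features, rest) == label:
            correct += 1
    return correct / len(samples)


# Hypothetical toy data: 1 = "DNA-binding", 0 = "non-binding".
# These sequences are invented and come from no benchmark dataset.
toy = [
    ("KKRKRKKAKR", 1),
    ("RKKRRAKKKR", 1),
    ("KRRKKKRGKK", 1),
    ("DEEDLLDEED", 0),
    ("EEDDELDDEE", 0),
    ("DDEELEEDDE", 0),
]
samples = [(composition(seq), label) for seq, label in toy]
rate = jackknife_success_rate(samples)
print(f"jackknife success rate: {rate:.2f}")
```

Because every sample is held out exactly once, the jackknife gives a single, deterministic result for a given dataset, unlike a random train/test split.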

The resurrection genome of Boea hygrometrica: A blueprint for survival of dehydration

Xiao

Zhang

et al. 2015

Proc. Natl. Acad. Sci. U.S.A.

131

162

"Drying without dying" is an essential trait in land plant evolution. Unraveling how a unique group of angiosperms, the Resurrection Plants, survive desiccation of their leaves and roots has been hampered by the lack of a foundational genome perspective. Here we report the ∼1,691-Mb sequenced genome of Boea hygrometrica, an important resurrection plant model. The sequence revealed evidence for two historical genome-wide duplication events, a compliment of 49,374 protein-coding genes, 29.15% of which are unique (orphan) to Boea and 20% of which (9,888) significantly respond to desiccation at the transcript level. Expansion of early light-inducible protein (ELIP) and 5S rRNA genes highlights the importance of the protection of the photosynthetic apparatus during drying and the rapid resumption of protein synthesis in the resurrection capability of Boea. Transcriptome analysis reveals extensive alternative splicing of transcripts and a focus on cellular protection strategies. The lack of desiccation tolerance-specific genome organizational features suggests the resurrection phenotype evolved mainly by an alteration in the control of dehydration response genes.

iLoc-Hum: using the accumulation-label scale to predict subcellular locations of human proteins with both single and multiple sites

2012

Although numerous efforts have been made to predict the subcellular locations of proteins from their sequence information, it remains a challenging problem, particularly when query proteins may have the multiplex character, i.e., they simultaneously exist at, or move between, two or more different subcellular location sites. Most of the existing methods rest on the assumption that a protein has one, and only one, subcellular location. However, recent evidence indicates that an increasing number of human proteins have multiple subcellular locations. Such multiplex proteins should not be ignored because they may bear special biological functions worthy of attention. Based on the accumulation-label scale, a new predictor, called iLoc-Hum, was developed for identifying the subcellular localization of human proteins with both single and multiple location sites. As a demonstration, the jackknife cross-validation was performed with iLoc-Hum on a benchmark dataset of human proteins that covers the following 14 location sites: centrosome, cytoplasm, cytoskeleton, endoplasmic reticulum, endosome, extracellular, Golgi apparatus, lysosome, microsome, mitochondrion, nucleus, peroxisome, plasma membrane, and synapse, where some proteins belong to two, three or four locations but none has 25% or higher pairwise sequence identity to any other in the same subset. For such a complicated and stringent system, the overall success rate achieved by iLoc-Hum was 76%, which is remarkably higher than that of any existing predictor that can also deal with this kind of system. Further comparisons were made on two independent datasets; in both, the success rates of iLoc-Hum exceeded those of its counterparts by an even wider margin. As a user-friendly web-server, iLoc-Hum is freely accessible to the public at or . For the convenience of most experimental scientists, a step-by-step guide is provided on how to use the web-server to get the desired results by choosing either a straightforward submission or a batch submission, without the need to follow the complicated mathematical equations involved.

iSuc-PseOpt: Identifying lysine succinylation sites in proteins by incorporating sequence-coupling effects into pseudo components and optimizing imbalanced training dataset

Jia

Xiao

et al. 2016

Analytical Biochemistry

252

159

Phagocytosis Enhances Lysosomal and Bactericidal Properties by Activating the Transcription Factor TFEB

et al. 2016

Macrophages internalize pathogens through phagocytosis, entrapping them into organelles called phagosomes. Phagosomes then fuse with lysosomes to mature into phagolysosomes, acquiring an acidic and hydrolytic lumen that kills the pathogens. During an ongoing infection, macrophages can internalize dozens of bacteria. Thus, we hypothesized that an initial round of phagocytosis might boost lysosome function and bactericidal ability to cope with subsequent rounds of phagocytosis. To test this hypothesis, we employed Fcγ receptor-mediated phagocytosis and endocytosis, which respectively internalize immunoglobulin G (IgG)-opsonized particles and polyvalent IgG immune complexes. We report that Fcγ receptor activation in macrophages enhanced lysosome-based proteolysis and killing of subsequently phagocytosed E. coli compared to naïve macrophages. Importantly, we show that Fcγ receptor activation caused nuclear translocation of TFEB, a transcription factor that boosts expression of lysosome genes. Indeed, Fcγ receptor activation was accompanied by increased expression of specific lysosomal proteins. Remarkably, TFEB silencing repressed the Fcγ receptor-mediated enhancements in degradation and bacterial killing. In addition, nuclear translocation of TFEB required phagosome completion and failed to occur in cells silenced for MCOLN1, a lysosomal Ca2+ channel, suggesting that lysosomal Ca2+ released during phagosome maturation activates TFEB. Finally, we demonstrated that non-opsonic phagocytosis of E. coli also enhanced lysosomal degradation in a TFEB-dependent manner, suggesting that this phenomenon is not limited to Fcγ receptors. Overall, we show that macrophages become better killers after one round of phagocytosis and suggest that phagosomes and lysosomes are capable of bi-directional signaling.

A Multi-Label Classifier for Predicting the Subcellular Localization of Gram-Negative Bacterial Proteins with Both Single and Multiple Sites

2011

Prediction of protein subcellular localization is a challenging problem, particularly when the system concerned contains both singleplex and multiplex proteins. In this paper, by introducing the “multi-label scale” and hybridizing the information of gene ontology with the sequential evolution information, a novel predictor called iLoc-Gneg is developed for predicting the subcellular localization of Gram-negative bacterial proteins with both single-location and multiple-location sites. To facilitate comparison, the same stringent benchmark dataset used to estimate the accuracy of Gneg-mPLoc was adopted to demonstrate the power of iLoc-Gneg. The dataset contains 1,392 Gram-negative bacterial proteins classified into the following eight locations: (1) cytoplasm, (2) extracellular, (3) fimbrium, (4) flagellum, (5) inner membrane, (6) nucleoid, (7) outer membrane, and (8) periplasm. Of the 1,392 proteins, 1,328 each have only one subcellular location and the other 64 each have two subcellular locations, but none of the proteins included has pairwise sequence identity to any other in the same subset (subcellular location). The overall success rate of iLoc-Gneg by jackknife test on this stringent benchmark dataset was over 91%, about 6% higher than that of Gneg-mPLoc. As a user-friendly web-server, iLoc-Gneg is freely accessible to the public at http://icpr.jci.edu.cn/bioinfo/iLoc-Gneg. A step-by-step guide is also provided on how to use the web-server to get the desired results. Furthermore, for the user's convenience, the iLoc-Gneg web-server accepts batch job submission, which is not available in the existing version of the Gneg-mPLoc web-server. It is anticipated that iLoc-Gneg may become a useful high throughput tool for Molecular Cell Biology, Proteomics, System Biology, and Drug Development.

Xuan Xiao

iAMP-2L: A two-level multi-label classifier for identifying antimicrobial peptides and their functional types

iLoc-Euk: A Multi-Label Classifier for Predicting the Subcellular Localization of Singleplex and Multiplex Eukaryotic Proteins
