Fule Liu scite author profile

With the avalanche of biological sequences generated in the post-genomic age, one of the most challenging problems in computational biology is how to effectively formulate the sequence of a biological sample (such as DNA, RNA or protein) with a discrete model or a vector that can effectively reflect its sequence pattern information or capture its key features concerned. Although several web servers and stand-alone tools were developed to address this problem, all these tools, however, can only handle one type of samples. Furthermore, the number of their built-in properties is limited, and hence it is often difficult for users to formulate the biological sequences according to their desired features or properties. In this article, with a much larger number of built-in properties, we are to propose a much more flexible web server called Pse-in-One (http://bioinformatics.hitsz.edu.cn/Pse-in-One/), which can, through its 28 different modes, generate nearly all the possible feature vectors for DNA, RNA and protein sequences. Particularly, it can also generate those feature vectors with the properties defined by users themselves. These feature vectors can be easily combined with machine-learning algorithms to develop computational predictors and analysis methods for various tasks in bioinformatics and system biology. It is anticipated that the Pse-in-One web server will become a very useful tool in computational proteomics, genomics, as well as biological sequence analysis. Moreover, to maximize users’ convenience, its stand-alone version can also be downloaded from http://bioinformatics.hitsz.edu.cn/Pse-in-One/download/, and directly run on Windows, Linux, Unix and Mac OS.

show abstract

Identification of Real MicroRNA Precursors with a Pseudo Structure Status Composition Approach

Liu

et al. 2015

View full text Add to dashboard Cite

Containing about 22 nucleotides, a micro RNA (abbreviated miRNA) is a small non-coding RNA molecule, functioning in transcriptional and post-transcriptional regulation of gene expression. The human genome may encode over 1000 miRNAs. Albeit poorly characterized, miRNAs are widely deemed as important regulators of biological processes. Aberrant expression of miRNAs has been observed in many cancers and other disease states, indicating they are deeply implicated with these diseases, particularly in carcinogenesis. Therefore, it is important for both basic research and miRNA-based therapy to discriminate the real pre-miRNAs from the false ones (such as hairpin sequences with similar stem-loops). Particularly, with the avalanche of RNA sequences generated in the postgenomic age, it is highly desired to develop computational sequence-based methods in this regard. Here two new predictors, called “iMcRNA-PseSSC” and “iMcRNA-ExPseSSC”, were proposed for identifying the human pre-microRNAs by incorporating the global or long-range structure-order information using a way quite similar to the pseudo amino acid composition approach. Rigorous cross-validations on a much larger and more stringent newly constructed benchmark dataset showed that the two new predictors (accessible at http://bioinformatics.hitsz.edu.cn/iMcRNA/) outperformed or were highly comparable with the best existing predictors in this area.

show abstract

repDNA: a Python package to generate various modes of feature vectors for DNA sequences by incorporating user-defined physicochemical properties and sequence-order effects

et al. 2014

View full text Add to dashboard Cite

show abstract

iMiRNA-PseDPC: microRNA precursor identification with a pseudo distance-pair composition approach

Liu

Fang

Liu

et al. 2015

Journal of Biomolecular Structure and Dynamics

128

View full text Add to dashboard Cite

A microRNA (miRNA) is a small non-coding RNA molecule, functioning in transcriptional and post-transcriptional regulation of gene expression. The human genome may encode over 1000 miRNAs. Albeit poorly characterized, miRNAs are widely deemed as important regulators of biological processes. Aberrant expression of miRNAs has been observed in many cancers and other disease states, indicating that they are deeply implicated with these diseases, particularly in carcinogenesis. Therefore, it is important for both basic research and miRNA-based therapy to discriminate the real pre-miRNAs from the false ones (such as hairpin sequences with similar stem-loops). Particularly, with the avalanche of RNA sequences generated in the post-genomic age, it is highly desired to develop computational sequence-based methods for effectively identifying the human pre-miRNAs. Here, we propose a predictor called "iMiRNA-PseDPC", in which the RNA sequences are formulated by a novel feature vector called "pseudo distance-pair composition" (PseDPC) with 10 types of structure statuses. Rigorous cross-validations on a much larger and more stringent newly constructed benchmark data-set showed that our approach has remarkably outperformed the existing ones in either prediction accuracy or efficiency, indicating the new predictor is quite promising or at least may become a complementary tool to the existing predictors in this area. For the convenience of most experimental scientists, a user-friendly web server for the new predictor has been established at http://bioinformatics.hitsz.edu.cn/iMiRNA-PseDPC/, by which users can easily get their desired results without the need to go through the mathematical details. It is anticipated that the new predictor may become a useful high throughput tool for genome analysis particularly in dealing with large-scale data.

show abstract

repRNA: a web server for generating various feature vectors of RNA sequences

et al. 2015

View full text Add to dashboard Cite

With the rapid growth of RNA sequences generated in the postgenomic age, it is highly desired to develop a flexible method that can generate various kinds of vectors to represent these sequences by focusing on their different features. This is because nearly all the existing machine-learning methods, such as SVM (support vector machine) and KNN (k-nearest neighbor), can only handle vectors but not sequences. To meet the increasing demands and speed up the genome analyses, we have developed a new web server, called "representations of RNA sequences" (repRNA). Compared with the existing methods, repRNA is much more comprehensive, flexible and powerful, as reflected by the following facts: (1) it can generate 11 different modes of feature vectors for users to choose according to their investigation purposes; (2) it allows users to select the features from 22 built-in physicochemical properties and even those defined by users' own; (3) the resultant feature vectors and the secondary structures of the corresponding RNA sequences can be visualized. The repRNA web server is freely accessible to the public at http://bioinformatics.hitsz.edu.cn/repRNA/ .

show abstract

miRNA-dis: microRNA precursor identification based on distance structure status pairs

et al. 2015

View full text Add to dashboard Cite

MicroRNA precursor identification is an important task in bioinformatics. Support Vector Machine (SVM) is one of the most effective machine learning methods used in this field. The performance of SVM-based methods depends on the vector representations of RNAs. However, the discriminative power of the existing feature vectors is limited, and many methods lack an interpretable model for analysis of characteristic sequence features. Prior studies have demonstrated that sequence or structure order effects were relevant for discrimination, but little work has explored how to use this kind of information for human pre-microRNA identification. In this study, in order to incorporate the structure-order information into the prediction, a method called "miRNA-dis" was proposed, in which the feature vector was constructed by the occurrence frequency of the "distance structure status pair" or just the "distance-pair". Rigorous cross-validations on a much larger and more stringent newly constructed benchmark dataset showed that the miRNA-dis outperformed some state-of-the-art predictors in this area. Remarkably, miRNA-dis trained with human data can correctly predict 87.02% of the 4022 pre-miRNAs from 11 different species ranging from animals, plants and viruses. miRNA-dis would be a useful high throughput tool for large-scale analysis of microRNA precursors. In addition, the learnt model can be easily analyzed in terms of discriminative features, and some interesting patterns were discovered, which could reflect the characteristics of microRNAs. A user-friendly web-server of miRNA-dis was constructed, which is freely accessible to the public at the web-site on http://bioinformatics.hitsz.edu.cn/miRNA-dis/.

show abstract

Protein Binding Site Prediction by Combining Hidden Markov Support Vector Machine and Profile-Based Propensities

Liu

et al. 2014

The Scientific World Journal

View full text Add to dashboard Cite

Identification of protein binding sites is critical for studying the function of the proteins. In this paper, we proposed a method for protein binding site prediction, which combined the order profile propensities and hidden Markov support vector machine (HM-SVM). This method employed the sequential labeling technique to the field of protein binding site prediction. The input features of HM-SVM include the profile-based propensities, the Position-Specific Score Matrix (PSSM), and Accessible Surface Area (ASA). When tested on different data sets, the proposed method showed promising results, and outperformed some closely relative methods by more than 10% in terms of AUC.

show abstract

DF-1 cells prevent MG-HS infection through gga-miR-24-3p/RAP1B mediated decreased proliferation and increased apoptosis

Wang

Deng

Sun

et al. 2021

Research in Veterinary Science

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Fule Liu

Pse-in-One: a web server for generating various modes of pseudo components of DNA, RNA, and protein sequences

Identification of Real MicroRNA Precursors with a Pseudo Structure Status Composition Approach

repDNA: a Python package to generate various modes of feature vectors for DNA sequences by incorporating user-defined physicochemical properties and sequence-order effects

iMiRNA-PseDPC: microRNA precursor identification with a pseudo distance-pair composition approach

repRNA: a web server for generating various feature vectors of RNA sequences

miRNA-dis: microRNA precursor identification based on distance structure status pairs

Protein Binding Site Prediction by Combining Hidden Markov Support Vector Machine and Profile-Based Propensities

DF-1 cells prevent MG-HS infection through gga-miR-24-3p/RAP1B mediated decreased proliferation and increased apoptosis

Contact Info

Product

Resources

About