Xuezheng Fu scite author profile

A genetic algorithm-based method for feature subset selection

As a commonly used technique in data preprocessing, feature selection selects a subset of informative attributes or variables to build models describing data. By removing redundant, irrelevant, or noisy features, feature selection can improve the predictive accuracy and the comprehensibility of predictors or classifiers. Many feature selection algorithms with different selection criteria have been introduced by researchers. However, no single criterion has been found to be best for all applications. In this paper, we propose a framework based on a genetic algorithm (GA) for feature subset selection that combines various existing feature selection methods. The advantages of this approach include the ability to accommodate multiple feature selection criteria and to find small subsets of features that perform well for the particular inductive learning algorithm used to build the classifier. We conducted experiments using three data sets and three existing feature selection methods. The experimental results demonstrate that our approach is a robust and effective way to find subsets of features with higher classification accuracy and/or smaller size compared to each individual feature selection algorithm.

Content-based Image Retrieval Using Gabor-Zernike Features

Harrison et al. 2006

A Hybrid Feature Selection Approach for Microarray Gene Expression Data

Tan, Wang, et al. 2006

Due to the huge number of genes and comparatively small number of samples in microarray gene expression data, accurate classification of diseases becomes challenging. Feature selection techniques can improve the classification accuracy by removing irrelevant and redundant genes. However, the performance of different feature selection algorithms based on different theoretical arguments varies even when they are applied to the same data set. In this paper, we propose a hybrid approach to combine useful outcomes from different feature selection methods through a genetic algorithm. The experimental results demonstrate that our approach can achieve better classification accuracy with a smaller gene subset than each individual feature selection algorithm does.

Improving Feature Subset Selection Using a Genetic Algorithm for Microarray Gene Expression Data

Tan, Zhang, et al.

Microarray data usually contain a huge number of genes (features) and a comparatively small number of samples, which makes accurate classification or prediction of diseases challenging. Feature selection techniques can help us distinguish important features from irrelevant (unimportant) ones by applying certain selection criteria. However, different feature selection algorithms based on various theoretical arguments often produce different results when applied to the same data set. This makes selecting an optimal or near-optimal feature subset for a data set difficult. In this paper, we propose using a genetic algorithm to improve feature subset selection by combining valuable outcomes from multiple feature selection methods. The goal of our genetic algorithm is to achieve a balance between the classification accuracy and the size of the feature subsets selected. The advantages of this approach include the ability to accommodate different feature selection criteria and to find small subsets of features that perform well for the particular inductive learning algorithm used to build the classifier. The experimental results demonstrate that our approach can find subsets of features with higher classification accuracy and/or smaller size compared with each individual feature selection algorithm.
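
The idea in this abstract, a genetic algorithm that starts from the outcomes of several feature selection methods and balances accuracy against subset size, can be illustrated with a minimal sketch. This is not the authors' implementation: the bit-mask encoding, the `size_weight` trade-off, the tournament/one-point-crossover operators, the feature names, and the `toy_accuracy` stand-in for a real classifier's cross-validated accuracy are all assumptions made for illustration.

```python
import random


def build_pool(*selections):
    """Merge the features chosen by several selectors, keeping first-seen order."""
    pool = []
    for selection in selections:
        for feature in selection:
            if feature not in pool:
                pool.append(feature)
    return pool


def to_mask(selection, pool):
    """Encode one selector's choice as a 0/1 chromosome over the pool."""
    chosen = set(selection)
    return [1 if feature in chosen else 0 for feature in pool]


def fitness(mask, accuracy_fn, size_weight=0.2):
    """Trade accuracy against subset size (this weighting is an assumption)."""
    n_selected = sum(mask)
    if n_selected == 0:
        return float("-inf")  # an empty subset cannot train a classifier
    size_penalty = n_selected / len(mask)
    return (1 - size_weight) * accuracy_fn(mask) - size_weight * size_penalty


def run_ga(seeds, accuracy_fn, pop_size=20, generations=30,
           mutation_rate=0.05, size_weight=0.2, rng_seed=0):
    """Evolve feature masks; the initial population includes each selector's mask."""
    rng = random.Random(rng_seed)
    n = len(seeds[0])

    def score(mask):
        return fitness(mask, accuracy_fn, size_weight)

    population = [list(s) for s in seeds]
    while len(population) < pop_size:
        population.append([rng.randint(0, 1) for _ in range(n)])

    best = max(population, key=score)
    for _ in range(generations):
        next_gen = [best[:]]  # elitism: the best mask always survives unchanged
        while len(next_gen) < pop_size:
            parent_a = max(rng.sample(population, 3), key=score)  # tournament
            parent_b = max(rng.sample(population, 3), key=score)
            cut = rng.randrange(1, n)  # one-point crossover
            child = parent_a[:cut] + parent_b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            next_gen.append(child)
        population = next_gen
        best = max(population, key=score)
    return best


# Hypothetical outputs of three feature selection methods (made-up gene names).
SELECTIONS = (["g1", "g3"], ["g2", "g7"], ["g3", "g4"])
POOL = build_pool(*SELECTIONS)
INFORMATIVE = {"g3", "g7"}  # toy ground truth, unknown in practice


def toy_accuracy(mask):
    """Stand-in for a classifier's accuracy on the masked features."""
    hits = sum(1 for f, bit in zip(POOL, mask) if bit and f in INFORMATIVE)
    return 0.5 + 0.25 * hits


SEEDS = [to_mask(s, POOL) for s in SELECTIONS]
BEST = run_ga(SEEDS, toy_accuracy)
print([f for f, bit in zip(POOL, BEST) if bit])
```

Because each method's own subset is placed in the initial population and the best mask is carried forward unchanged, the returned subset scores at least as well as every individual method's subset under this fitness. In a real setting, `toy_accuracy` would be replaced by the accuracy of the learning algorithm of interest.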

RNA Pseudoknot Prediction Using Term Rewriting

Wang, Harrison, et al.

Multi-Level Discrete Cosine Transform for Content-Based Image Retrieval by Support Vector Machines

Chen et al. 2007

Pluggable Application Server Framework

Wang, Tan, Sabnis, et al. 2006

Building a system based on variants of disparate individual components/programs is usually a challenging task. The components/programs are not designed to communicate with each other, but constructing the whole system requires seamless collaboration among them. In this paper, a pluggable application server framework targeting protein structure prediction is presented. The framework is capable of combining various existing programs into an efficient unit, and the design aims to provide a model that can integrate heterogeneous components/programs into the system quickly without modifying their code. Based on the model, different components can be plugged into the system with easy configuration, leading to a self-configurable and adaptive system. A protein structure prediction server was implemented by applying the design model, and the implementation emphasizes the efficiency and simplicity of the system construction. The method and model are generic and can be applied to other system designs as well.

A rule-based approach for RNA pseudoknot prediction

Fu, Wang, Harrison, et al. 2008, IJDMB

RNA plays a critical role in mediating every step of cellular information transfer from genes to functional proteins. Pseudoknots are functionally important and widely occurring structural motifs found in all types of RNA. Therefore, predicting their structures is an important problem. In this paper, we present a new RNA pseudoknot structure prediction method based on term rewriting. The method is implemented using the Mfold RNA/DNA folding package and the term rewriting language Maude. In our method, RNA structures are treated as terms and rules are discovered for predicting pseudoknots. Our method was tested on 211 pseudoknots in PseudoBase and achieved an average accuracy of 74.085% relative to the experimentally determined structures. In fact, most pseudoknots discovered by our method achieve an accuracy above 90%. These results indicate that term rewriting has broad potential in RNA applications ranging from prediction of pseudoknots to discovery of higher-level RNA structures involving complex RNA tertiary interactions.

Xuezheng Fu

A genetic algorithm-based method for feature subset selection
