Manoranjan Dash scite author profile

Processing applications with a large number of dimensions has been a challenge to the KDD community. Feature selection, an effective dimensionality reduction technique, is an essential pre-processing method to remove noisy features. In the literature there are only a few methods proposed for feature selection for clustering, and almost all of those methods are 'wrapper' techniques that require a clustering algorithm to evaluate the candidate feature subsets. The wrapper approach is largely unsuitable in real-world applications due to its heavy reliance on clustering algorithms that require parameters such as the number of clusters, and due to the lack of suitable clustering criteria to evaluate clustering in different subspaces. In this paper we propose a 'filter' method that is independent of any clustering algorithm. The proposed method is based on the observation that data with clusters has a very different point-to-point distance histogram than data without clusters. Using this we propose an entropy measure that is low if data has distinct clusters and high otherwise. The entropy measure is suitable for selecting the most important subset of features because it is invariant with the number of dimensions, and is affected only by the quality of clustering. Extensive performance evaluation over synthetic, benchmark, and real datasets shows its effectiveness.
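The idea of a distance-based entropy that is low for clustered data and high otherwise can be illustrated with a small sketch. The abstract does not give the paper's exact formula, so the binary-entropy form over normalized pairwise distances used here is an assumption, and the names `pairwise_distances` and `distance_entropy` are hypothetical. With two tight clusters, most normalized distances fall near 0 (same cluster) or near 1 (different clusters), so the entropy should come out lower than for uniformly scattered points.

```python
import math
import random


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points."""
    return [
        math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    ]


def distance_entropy(points):
    """Mean binary entropy of pairwise distances scaled to [0, 1].

    ASSUMPTION: this form is an illustration only; it is not taken
    from the paper, whose exact measure the abstract does not state.
    """
    dists = pairwise_distances(points)
    d_max = max(dists)
    if d_max == 0:
        return 0.0  # all points coincide
    total = 0.0
    for d in dists:
        p = d / d_max
        # p == 0 or p == 1 contributes zero entropy (limit of p*log p).
        if 0.0 < p < 1.0:
            total -= p * math.log2(p) + (1 - p) * math.log2(1 - p)
    return total / len(dists)


rng = random.Random(0)  # fixed seed so the comparison is repeatable

# Two tight Gaussian clusters centred at (0, 0) and (10, 10).
clustered = [
    (cx + rng.gauss(0, 0.1), cy + rng.gauss(0, 0.1))
    for cx, cy in [(0.0, 0.0), (10.0, 10.0)]
    for _ in range(50)
]

# The same number of points scattered uniformly over a square.
uniform = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(100)]

clustered_entropy = distance_entropy(clustered)
uniform_entropy = distance_entropy(uniform)
print(f"clustered: {clustered_entropy:.3f}  uniform: {uniform_entropy:.3f}")
```

In a filter-style selection loop, a score of this kind could be computed on each candidate feature subset, preferring subsets with lower entropy. No clustering algorithm or number of clusters is needed to compute it.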


Markov blanket-embedded genetic algorithm for gene selection

Zhu, Ong, Dash. 2007. Pattern Recognition.

402

161



Manoranjan Dash

Feature selection for classification


Consistency-based search in feature selection

Feature selection for clustering - a filter solution

Markov blanket-embedded genetic algorithm for gene selection
