Feature selection (FS), also known as attribute selection, is the process of selecting a subset of relevant features for use in model construction. It improves classification accuracy by removing irrelevant and noisy features. FS can be implemented using either batch learning or online learning. Currently, most FS methods are executed in batch mode. However, these techniques require longer execution times and larger storage space because they must process the entire dataset. Owing to this lack of scalability, batch learning cannot be applied to large data. In the present study, a scalable and efficient Online Feature Selection (OFS) approach using the Sparse Gradient (SGr) technique is proposed to select features from a dataset online. In this approach, the feature weights are proportionally decremented based on a threshold value, so that the weights of insignificant features are driven to zero. To demonstrate the efficiency of this approach, an extensive set of experiments was conducted on 13 real-world datasets ranging from small to large. The results showed an improvement in classification accuracy of 15%, which is significant when compared with existing methods.
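The core idea of the abstract — an online pass that proportionally shrinks feature weights and truncates insignificant ones to zero — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the logistic-loss update, the learning rate `lr`, the shrinkage rate `lam`, and the truncation threshold `eps` are all assumptions made for the example.

```python
import numpy as np

def sparse_gradient_ofs(X, y, lr=0.1, lam=0.01, eps=1e-3):
    """Online feature selection via a sparse-gradient-style update.

    Hypothetical sketch: each instance triggers one SGD step on a
    logistic loss, followed by a proportional decrement of all weights;
    weights whose magnitude falls below `eps` are truncated to exactly
    zero, so insignificant features drop out during the online pass.
    """
    n, d = X.shape
    w = np.zeros(d)
    for i in range(n):
        xi, yi = X[i], y[i]                # one instance at a time (online)
        p = 1.0 / (1.0 + np.exp(-w @ xi))  # predicted probability of class 1
        w += lr * (yi - p) * xi            # gradient step on logistic loss
        w *= (1.0 - lr * lam)              # proportional decrement of weights
        w[np.abs(w) < eps] = 0.0           # truncate insignificant weights
    return w
```

After the pass, the nonzero entries of `w` identify the selected features; because each instance is seen once and discarded, storage stays constant regardless of dataset size.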
Game Theory (GT) is the study of strategic decision making. Owing to its importance, several GT-based methodologies for Feature Selection (FS) have been proposed in recent times. The FS problem can be abstracted as a game by treating each feature as a player and its values as its strategies, with the overall goal of the game set to classifying a data instance correctly. Most existing GT-based FS techniques are restricted to zero-sum games, non-zero-sum games, and cooperative games. The classical assumption that all details of the players are known to all players does not hold in many real-world problems. When the given features are independent, they cannot be treated alike, and a characteristics-based uncertainty persists among the features. None of the game forms used in the existing methods handles this uncertainty. Unlike these game forms, Bayesian Games (BG) address games with imperfect information. This paper investigates the FS problem in terms of BG and proposes a novel method to select the best features. The proposed BG-based FS method is a filter-type method: it starts by identifying Principal Features (PF) and then plays global pairwise Bayesian games between those PF to obtain feature scores. The features are then ranked using these scores. In the final stage, a forward selection method with a Support Vector Machine (SVM) evaluates the classification performance of the ranked features and helps select the optimal feature subset. In addition, the MapReduce paradigm is exploited to improve the scalability of the proposed method. To show the efficacy of the proposed method, experiments were carried out on seven real-world datasets from the UCI and Statlog repositories. The promising results showed a significant improvement in classification performance with fewer selected features than those achieved by existing methods.
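The final stage described above — forward selection over a pre-ranked feature list, evaluated by a classifier — can be sketched as below. This is a hedged illustration, not the paper's method: the Bayesian-game scoring is assumed to have already produced the ranking, and a nearest-centroid classifier is used as a lightweight stand-in for the SVM so the example stays self-contained.

```python
import numpy as np

def accuracy(train_X, train_y, test_X, test_y):
    """Nearest-centroid classifier (stand-in for SVM in this sketch)."""
    c0 = train_X[train_y == 0].mean(axis=0)
    c1 = train_X[train_y == 1].mean(axis=0)
    pred = (np.linalg.norm(test_X - c1, axis=1)
            < np.linalg.norm(test_X - c0, axis=1)).astype(int)
    return float((pred == test_y).mean())

def forward_select(X, y, ranked):
    """Greedy forward selection over features in ranked order.

    Evaluates each prefix of the ranked list on a held-out split and
    returns the prefix with the best held-out accuracy.
    """
    n = len(y)
    split = n // 2                       # simple train / held-out split
    best_acc, best_k = -1.0, 0
    for k in range(1, len(ranked) + 1):
        cols = ranked[:k]                # first k features by rank
        acc = accuracy(X[:split, cols], y[:split],
                       X[split:, cols], y[split:])
        if acc > best_acc:               # keep the best-scoring prefix
            best_acc, best_k = acc, k
    return ranked[:best_k], best_acc
```

In the paper's setting, `ranked` would come from the pairwise Bayesian-game scores, and `accuracy` would be replaced by SVM cross-validation; the greedy prefix search itself is the forward-selection step.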