Abstract: Spam is defined as redundant and unwanted electronic mail. It now causes many problems in business life, such as consuming network bandwidth and filling users' mailboxes. Because of these problems, much research has addressed spam detection using classification techniques. Recent research shows that feature selection can have a positive effect on the efficiency of machine learning algorithms. Most algorithms try to build a data model that depends on reliably identifying a small set of features. Irrelevant features included during model building lead to weaker estimates and more computation. This research evaluates the detection of spam among legitimate e-mails, and the effect of feature selection on several machine learning algorithms, by presenting a feature selection method based on a genetic algorithm. Bayesian network and KNN classifiers are used in the classification phase, and the Spambase dataset is used for evaluation.
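To make the idea in the abstract above concrete, the following is a minimal sketch of wrapper-style feature selection with a genetic algorithm scored by a KNN classifier. It is not the authors' implementation: the synthetic data (a stand-in for Spambase), the function names `make_data`, `knn_accuracy` and `ga_select`, and all parameter values (population size, mutation rate, per-feature penalty) are assumptions chosen for illustration.

```python
import random

def make_data(n=90, n_signal=3, n_noise=7, seed=0):
    """Synthetic stand-in for a spam dataset (NOT Spambase).
    The first n_signal columns depend on the label; the rest are pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(2.0 * label, 1.0) for _ in range(n_signal)]
        row += [rng.gauss(0.0, 3.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y

def knn_accuracy(X_train, y_train, X_val, y_val, mask, k=3):
    """Validation accuracy of k-nearest-neighbours using only the columns where mask == 1."""
    cols = [i for i, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for row, label in zip(X_val, y_val):
        neighbours = sorted(
            (sum((row[c] - other[c]) ** 2 for c in cols), other_label)
            for other, other_label in zip(X_train, y_train)
        )
        votes = sum(lbl for _, lbl in neighbours[:k])
        pred = 1 if 2 * votes > k else 0
        correct += pred == label
    return correct / len(y_val)

def ga_select(X_train, y_train, X_val, y_val, pop_size=12, generations=8,
              mutation_rate=0.1, penalty=0.01, seed=1):
    """Evolve bit masks over the features; fitness = accuracy minus a small size penalty."""
    rng = random.Random(seed)
    n_features = len(X_train[0])
    cache = {}

    def fitness(mask):
        key = tuple(mask)
        if key not in cache:
            acc = knn_accuracy(X_train, y_train, X_val, y_val, mask)
            cache[key] = acc - penalty * sum(mask)
        return cache[key]

    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]      # truncation selection
        children = [ranked[0][:]]             # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - bit if rng.random() < mutation_rate else bit
                     for bit in child]          # bit-flip mutation
            children.append(child)
        population = children
    return max(population, key=fitness)

if __name__ == "__main__":
    X, y = make_data()
    best = ga_select(X[:60], y[:60], X[60:], y[60:])
    print("selected feature indices:", [i for i, bit in enumerate(best) if bit])
```

Each individual is a bit mask over the features, so the search operates directly on candidate subsets. The size penalty is one simple way to favour smaller subsets; a real study would more likely use cross-validation on the actual dataset rather than a single hold-out split.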
Spam is a fundamental problem in electronic communication, affecting large-scale email systems as well as a large number of weblogs and social networks. Because of the problems spam creates, much research has addressed it using classification techniques. Redundant, high-dimensional data is a serious problem for these classification algorithms because of its high computational cost and memory use. Reducing the feature space yields a more understandable model and makes a wider range of methods applicable. This paper presents a feature selection method based on the imperialist competitive algorithm. Decision tree and SVM classifiers are used in the classification phase. To demonstrate the efficiency of the method, its results on the Spambase dataset are compared with those of previously proposed algorithms, such as the genetic algorithm. The results show that this method improves the efficiency of spam detection.
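The abstract above does not give the details of its imperialist competitive algorithm, so the following is only a simplified binary sketch of the general technique (assimilation, revolution, imperialist replacement, and competition between empires). Several things are assumptions made for illustration: the synthetic data stands in for Spambase, a nearest-centroid classifier stands in for the decision tree and SVM, colonies are handed to the strongest empire rather than assigned probabilistically, and the names `make_cost` and `ica_select` and all parameter values are hypothetical.

```python
import random

def make_data(n=150, n_signal=3, n_noise=9, seed=0):
    """Synthetic stand-in for a spam dataset (NOT Spambase): signal columns, then noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        row = [rng.gauss(1.5 * label, 1.0) for _ in range(n_signal)]
        row += [rng.gauss(0.0, 3.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y

def make_cost(X_train, y_train, X_val, y_val, penalty=0.01):
    """Cost = validation error of a nearest-centroid classifier + penalty per selected feature."""
    cache = {}

    def cost(mask):
        key = tuple(mask)
        if key in cache:
            return cache[key]
        cols = [i for i, bit in enumerate(mask) if bit]
        if not cols:
            cache[key] = 1.0  # an empty subset is treated as always wrong
            return 1.0
        centroids = {}
        for cls in (0, 1):
            rows = [r for r, lbl in zip(X_train, y_train) if lbl == cls]
            centroids[cls] = [sum(r[c] for r in rows) / len(rows) for c in cols]
        errors = 0
        for row, lbl in zip(X_val, y_val):
            dist = {cls: sum((row[c] - m) ** 2 for c, m in zip(cols, cen))
                    for cls, cen in centroids.items()}
            errors += min(dist, key=dist.get) != lbl
        cache[key] = errors / len(y_val) + penalty * len(cols)
        return cache[key]

    return cost

def ica_select(cost, n_features, n_countries=20, n_empires=4, iterations=15,
               beta=0.5, revolution=0.1, xi=0.1, seed=1):
    """Simplified binary imperialist competitive search over feature masks."""
    rng = random.Random(seed)
    countries = [[rng.randint(0, 1) for _ in range(n_features)]
                 for _ in range(n_countries)]
    countries.sort(key=cost)
    # The lowest-cost countries become imperialists; the rest are dealt out round-robin.
    empires = [{"imp": c, "cols": []} for c in countries[:n_empires]]
    for i, c in enumerate(countries[n_empires:]):
        empires[i % n_empires]["cols"].append(c)

    def total_cost(emp):
        cols = emp["cols"]
        mean_cols = sum(map(cost, cols)) / len(cols) if cols else 0.0
        return cost(emp["imp"]) + xi * mean_cols

    for _ in range(iterations):
        for emp in empires:
            for j, col in enumerate(emp["cols"]):
                # Assimilation: copy each bit from the imperialist with probability beta.
                col = [ib if rng.random() < beta else b
                       for b, ib in zip(col, emp["imp"])]
                # Revolution: flip each bit with a small probability.
                col = [1 - b if rng.random() < revolution else b for b in col]
                emp["cols"][j] = col
                if cost(col) < cost(emp["imp"]):  # the colony overtakes its imperialist
                    emp["cols"][j], emp["imp"] = emp["imp"], col
        # Competition: the weakest empire loses a colony (or collapses) to the strongest.
        weakest = max(empires, key=total_cost)
        strongest = min(empires, key=total_cost)
        if weakest is not strongest:
            if weakest["cols"]:
                worst = max(weakest["cols"], key=cost)
                weakest["cols"].remove(worst)
                strongest["cols"].append(worst)
            else:
                empires.remove(weakest)
                strongest["cols"].append(weakest["imp"])
    return min((emp["imp"] for emp in empires), key=cost)

if __name__ == "__main__":
    X, y = make_data()
    cost = make_cost(X[:100], y[:100], X[100:], y[100:])
    best = ica_select(cost, n_features=len(X[0]))
    print("selected feature indices:", [i for i, bit in enumerate(best) if bit])
```

Because the search only sees a `cost` function, the stand-in classifier could be replaced by any other model evaluated on the selected columns without changing `ica_select`.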