2014
DOI: 10.1155/2014/649260

An Ant Colony Optimization Based Feature Selection for Web Page Classification

Abstract: The increased popularity of the web has caused the inclusion of huge amount of information to the web, and as a result of this explosive information growth, automated web page classification systems are needed to improve search engines' performance. Web pages have a large number of features such as HTML/XML tags, URLs, hyperlinks, and text contents that should be considered during an automated classification process. The aim of this study is to reduce the number of features to be used to improve runtime and ac…

Citations: cited by 45 publications (28 citation statements)
References: 39 publications (45 reference statements)
“…For the same reason as in Table 2, we excluded the Group Hunting section. [49] FA [56] ABC [33,41] 13 GSO [50] FA [53] FA [58,81] ACO [82] ACO [83] Based on the survey, we can state that Bio-Medical Engineering is the most used field for algorithms' testing or applying. Researchers in [58] proposed the Chaos FA algorithm for heart disease prediction; the Bacterial Memetic Algorithm based feature selection for surface EMG-based hand motion recognition in long-term use was suggested by [85].…”
mentioning
confidence: 99%
“…The researchers in [33] presented a novel method named IFAB based on an ABC algorithm for feature selection, and the authors in [53] used the FA algorithm for blind image steganalysis. Some other problems tackled in this field comprise fault diagnosis of complex structures with the BFO algorithm [64], skin detection based background removal for enhanced face recognition with adaptive BPSO [89], identifying malicious web domains with BPSO in [88], and web page classification with ACO [83].…”
mentioning
confidence: 99%
“…In this model, grid based ACO was implemented to optimize the parameters of SVM, and feature subset selection was performed through F-statistics. Nemati et al. [14] introduced a parallel combined version of ACO and GA for feature selection in protein function prediction. Sarac and Ozel [53] proposed an ACO based feature selection approach for web page classification, in which feature extraction is applied before feature selection to group similar HTML tags together, i.e., to reduce the feature space. More information on ACO based feature selection can be found in [50,54].…”
Section: Existing Feature Selection Methods
mentioning
confidence: 99%
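
The ACO based feature selection mentioned in the statement above can be illustrated with a minimal sketch. This is not the method of the cited paper: it assumes a pheromone-only selection rule (no heuristic term), a fixed subset size, and a caller-supplied `fitness` function. The names `aco_select_features`, `fitness`, `n_ants`, `n_iterations`, and `evaporation` are hypothetical and chosen only for illustration.

```python
import random


def aco_select_features(n_features, subset_size, fitness,
                        n_ants=10, n_iterations=20, evaporation=0.1, seed=0):
    """Toy ant colony search for a feature subset (illustrative assumptions only).

    fitness: hypothetical callable taking a list of feature indices and
    returning a non-negative score (for example, a validation accuracy).
    """
    rng = random.Random(seed)
    pheromone = [1.0] * n_features  # one pheromone value per feature
    best_subset, best_score = None, float("-inf")

    for _ in range(n_iterations):
        results = []
        for _ in range(n_ants):
            # Each ant draws distinct features with probability
            # proportional to the current pheromone values.
            candidates = list(range(n_features))
            subset = []
            for _ in range(subset_size):
                weights = [pheromone[f] for f in candidates]
                chosen = rng.choices(candidates, weights=weights, k=1)[0]
                candidates.remove(chosen)
                subset.append(chosen)
            score = fitness(subset)
            results.append((score, subset))
            if score > best_score:
                best_score, best_subset = score, sorted(subset)

        # Evaporation: every trail decays by the same factor.
        pheromone = [(1.0 - evaporation) * p for p in pheromone]
        # Deposit: features used by an ant are reinforced by its score.
        for score, subset in results:
            for f in subset:
                pheromone[f] += max(score, 0.0)

    return best_subset, best_score
```

In practice the `fitness` callable would wrap a classifier evaluated on the selected features; here it is left abstract so the search loop itself stays visible.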
“…Then, features whose document frequency is less than 0.1% of the number of documents in the training set are eliminated to remove misspelled words or words which are used very rarely. According to Salton [44] and our previous studies [45][46][47], using the document frequency of terms allows us to eliminate misspelled or unimportant terms from the feature space. The numbers of features obtained with the three feature extraction methods under a 0.1% document frequency filtering are given in Table 4.…”
Section: Feature Extraction
mentioning
confidence: 99%
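
The document frequency filter described in the statement above can be sketched as follows. This is a minimal illustration, assuming documents are already tokenized into lists of terms; the function name `filter_by_document_frequency` and the parameter `min_fraction` are hypothetical, and only the 0.1% threshold comes from the quoted text.

```python
from collections import Counter


def filter_by_document_frequency(docs, min_fraction=0.001):
    """Return the set of terms kept after document frequency filtering.

    docs: list of token lists, one per training document (assumed input shape).
    A term is eliminated when its document frequency is less than
    min_fraction times the number of documents (0.1% by default).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        # Count each term at most once per document.
        doc_freq.update(set(tokens))
    threshold = min_fraction * n_docs
    return {term for term, count in doc_freq.items() if count >= threshold}
```

With 2,000 training documents the threshold is 2 documents, so a term that occurs in a single document (a typical misspelling) is dropped while a term that occurs in five documents is kept.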