Proceedings of the 2008 ACM Symposium on Applied Computing
DOI: 10.1145/1363686.1363896

Using ambiguity measure feature selection algorithm for support vector machine classifier

Abstract: With the ever-increasing number of documents on the web, in digital libraries, news sources, etc., the need for a text classifier that can classify massive amounts of data is becoming ever more critical, and the task more difficult. The major problem in text classification is the high dimensionality of the feature space. The Support Vector Machine (SVM) classifier has been shown to perform consistently better than other text classification algorithms. However, training an SVM model takes more time than other algorithms. We explore the…

Cited by 11 publications (7 citation statements)
References 15 publications
“…Many well‐known feature‐selection algorithms are used with SVM to improve its accuracy and efficiency. We use the AM feature‐selection method as a preprocessing step for the support vector machine classifier (Mengle & Goharian, 2008). The features whose AM scores are below a given threshold, i.e., more ambiguous terms, are purged while the features whose AM scores are above a given threshold are used for the SVM learning phase.…”
Section: Introduction
confidence: 99%
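As an illustration only, here is a minimal sketch of that preprocessing step. The AM formula follows Mengle and Goharian's definition as summarized in these statements (the fraction of a term's occurrences that fall in a single category, maximized over categories); the function names, matrix layout, and the 0.9 threshold are assumptions, not taken from the paper.

```python
# Hypothetical sketch of AM-based feature selection before SVM training.
# Assumes AM(t, c) = tf(t, c) / tf(t) and AM(t) = max over categories c;
# names and the 0.9 threshold are illustrative, not the paper's values.
import numpy as np
from sklearn.svm import LinearSVC

def ambiguity_scores(tf: np.ndarray) -> np.ndarray:
    """tf: (n_categories, n_terms) matrix of term frequencies.
    Returns AM(t) = max_c tf(t, c) / tf(t) for every term."""
    totals = tf.sum(axis=0)                        # tf(t) across all categories
    with np.errstate(divide="ignore", invalid="ignore"):
        am = np.where(totals > 0, tf / totals, 0.0)
    return am.max(axis=0)                          # best single-category fit

def select_features(X: np.ndarray, tf: np.ndarray, threshold: float = 0.9):
    """Purge terms whose AM score falls below the threshold; keep the rest."""
    keep = ambiguity_scores(tf) >= threshold
    return X[:, keep], keep

# Usage: X is a (documents x terms) matrix, y the category labels, and tf
# the per-category term-frequency matrix built from the training set.
# X_reduced, mask = select_features(X, tf)
# clf = LinearSVC().fit(X_reduced, y)
```

Only the surviving columns of the document-term matrix reach the SVM learning phase, which is what shrinks the training time the abstract alludes to.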
“…Using the gradient descent method, we can iteratively update the centroid feature vectors stochastically to obtain the best centroid via formulas (16) and (17), as follows:…”
Section: Smoothing Listwise Ranking Centroid Methods
confidence: 99%
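Formulas (16) and (17) are not reproduced in the snippet. Purely as a generic illustration of a stochastic centroid update, not the citing paper's actual rule, a sketch under an assumed squared-distance loss might look like this:

```python
# Generic stochastic-gradient centroid update; the real update rules are
# the citing paper's formulas (16)-(17), which are not shown here. The
# squared-distance loss and the learning rate are assumptions.
import numpy as np

def sgd_centroid_update(centroid, x, lr=0.01):
    """One stochastic step minimizing ||x - centroid||^2 for a sampled
    document vector x; the gradient w.r.t. the centroid is 2*(centroid - x)."""
    return centroid - lr * 2.0 * (centroid - x)

rng = np.random.default_rng(0)
docs = rng.normal(size=(100, 5))          # toy document vectors
c = np.zeros(5)
for _ in range(20):                       # epochs
    for x in rng.permutation(docs):       # shuffle, then one step per doc
        c = sgd_centroid_update(c, x)
# c now approximates docs.mean(axis=0), the minimizer of the summed loss
```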
“…Furthermore, the naïve Bayes classifier trains in linear time, unlike SVM. We improved the effectiveness of the model by using two feature‐selection algorithms, namely odds ratio (Mladenić & Grobelnik, 1998) and ambiguity measure (AM), which was shown to outperform existing feature‐selection algorithms (Mengle & Goharian, 2008b). We evaluated the effectiveness of these feature‐selection algorithms on unbalanced datasets and observed that AM is better suited for such tasks.…”
Section: Methods
confidence: 99%
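The odds-ratio score referenced above has a standard form, OR(t, c) = log[P(t|c)(1 − P(t|¬c)) / ((1 − P(t|c)) P(t|¬c))]. The sketch below computes it from document frequencies; the smoothing constant and names are illustrative assumptions, not Mladenić and Grobelnik's exact formulation.

```python
# Hedged sketch of odds-ratio feature scoring (after Mladenic & Grobelnik,
# 1998): OR(t, c) = log[ P(t|c)(1 - P(t|~c)) / ((1 - P(t|c)) P(t|~c)) ].
import numpy as np

def odds_ratio(df_pos, n_pos, df_neg, n_neg, eps=0.5):
    """df_pos: docs in category c containing the term; df_neg: docs outside
    c containing it. Smoothing with eps keeps the log finite for terms
    absent on one side (an assumed, Laplace-style choice)."""
    p = (df_pos + eps) / (n_pos + 2 * eps)    # ~ P(t|c)
    q = (df_neg + eps) / (n_neg + 2 * eps)    # ~ P(t|~c)
    return np.log(p * (1 - q) / ((1 - p) * q))
```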
“…Ambiguity measure (AM; Mengle & Goharian, 2008b) assigns a high score to a term if it appears consistently in only one specific category. The AM for a term t_k with respect to category c_i is calculated using Equation 2.…”
Section: Methods
confidence: 99%
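Equation 2 itself is not reproduced in the snippet. Reconstructed from Mengle and Goharian's definition as summarized above (a reconstruction, not a quotation), the measure is the fraction of a term's occurrences that fall in the given category:

```latex
% Reconstruction of the AM definition (Mengle & Goharian, 2008b);
% tf(t_k, c_i) is the frequency of term t_k in category c_i.
AM(t_k, c_i) = \frac{tf(t_k, c_i)}{tf(t_k)}, \qquad
AM(t_k) = \max_{c_i} AM(t_k, c_i)
```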