DOI: 10.1007/978-3-540-85845-4_9

Multi-value Classification of Very Short Texts

Abstract: We introduce a new stacking-like approach for multi-value classification. We apply this classification scheme using Naive Bayes, Rocchio and kNN classifiers on the well-known Reuters dataset. We use part-of-speech tagging for stopword removal. We show that our setup performs almost as well as other approaches that use the full article text even though we only classify headlines. Finally, we apply a Rocchio classifier on a dataset from a Web 2.0 site and show that it is suitable for semi-automated lab…
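The Rocchio classifier named in the abstract assigns a headline to the class whose term centroid is most similar to the headline's bag-of-words vector. A minimal sketch of that idea (the tokenization, toy data, and class names here are illustrative, not the paper's Reuters setup):

```python
from collections import Counter
from math import sqrt

def bow(text):
    """Bag-of-words vector as a term -> count mapping."""
    return Counter(text.lower().split())

def centroid(docs):
    """Average the bag-of-words vectors of all docs in one class."""
    total = Counter()
    for d in docs:
        total.update(bow(d))
    n = len(docs)
    return {t: c / n for t, c in total.items()}

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def rocchio_train(labelled):
    """labelled: {class: [headline, ...]} -> {class: centroid}."""
    return {c: centroid(docs) for c, docs in labelled.items()}

def rocchio_classify(centroids, headline):
    """Pick the class whose centroid is closest in cosine similarity."""
    v = bow(headline)
    return max(centroids, key=lambda c: cosine(centroids[c], v))

# Toy training data standing in for Reuters-style headlines.
train = {
    "grain": ["wheat prices rise", "corn harvest delayed"],
    "crude": ["oil output cut", "crude futures climb"],
}
model = rocchio_train(train)
print(rocchio_classify(model, "wheat harvest report"))  # → grain
```

Because headlines are very short, almost every surviving term is discriminative, which is why the paper pairs such classifiers with part-of-speech-based stopword removal.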

Cited by 7 publications (11 citation statements)
References 7 publications
“…The most prominent approach to adapt classifiers for multi-labeling is binary relevance [26,28]. Other options include the chaining [21] as well as stacking [9,27] of classifiers. While the former is not well-suited for large numbers of labels, we also include a variation of the latter idea in our comparison.…”
Section: Related Work
confidence: 99%
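Binary relevance, the decomposition this statement names as the most prominent multi-label adaptation, turns one multi-label problem into one independent binary problem per label; chaining additionally feeds earlier labels' predictions into later classifiers. A minimal sketch of the binary-relevance decomposition (documents and label names are toy examples, not from the paper):

```python
def binary_relevance_datasets(docs):
    """docs: list of (text, set_of_labels).
    Returns {label: [(text, 0 or 1), ...]} — one binary
    training set per label, each trainable in isolation."""
    labels = set().union(*(ls for _, ls in docs))
    return {
        lab: [(text, int(lab in ls)) for text, ls in docs]
        for lab in sorted(labels)
    }

docs = [
    ("oil prices climb",       {"crude"}),
    ("wheat exports to rise",  {"grain", "trade"}),
    ("opec cuts crude output", {"crude", "trade"}),
]
per_label = binary_relevance_datasets(docs)
print(sorted(per_label))  # → ['crude', 'grain', 'trade']
print(per_label["crude"])
```

The per-label independence is what makes binary relevance scale linearly in the number of labels, whereas a classifier chain must fix an ordering and grows more fragile as the label set grows.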
“…As meta-classifiers, we use decision trees with Gini impurity as the splitting criterion. To limit complexity, we generate training data only for those meta-classifiers whose class is among the top 30 of the base classifier's ranking [9]. We use this decision tree module (abbreviated with the suffix *DT) as an alternative to hard cut-offs in Learning to Rank (see Section 2) and the fixed thresholds in multi-layer perceptrons (see Section 3.2.2).…”
Section: Multi-label Adaption
confidence: 99%
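The cited setup ranks labels with a base classifier and then lets one per-label meta-classifier (a decision tree there) accept or reject each of the top-30 candidates. A schematic sketch of that meta step, with simple per-label score thresholds standing in for trained decision trees (all names, scores, and thresholds here are illustrative):

```python
def meta_decide(ranking, meta, k=30):
    """ranking: [(label, score), ...] sorted by score, descending.
    meta: {label: decision_fn(score) -> bool} — one stand-in
    meta-classifier per label (decision trees in the cited setup).
    Only the top-k candidates ever reach a meta decision,
    which keeps the number of meta training examples bounded."""
    accepted = []
    for label, score in ranking[:k]:
        decide = meta.get(label, lambda s: False)
        if decide(score):
            accepted.append(label)
    return accepted

# Base-classifier ranking over toy labels.
ranking = [("crude", 0.9), ("trade", 0.4), ("grain", 0.1)]
# Threshold functions standing in for per-label decision trees.
meta = {
    "crude": lambda s: s > 0.5,
    "trade": lambda s: s > 0.3,
    "grain": lambda s: s > 0.5,
}
print(meta_decide(ranking, meta))  # → ['crude', 'trade']
```

Compared with a fixed global threshold, a learned per-label decision lets each label have its own acceptance behaviour, which is the point of this stacking variant.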
“…comments, reviews or web searches) has been intensively studied since 2008 [15,46,27]. There are great benefits in being able to analyse short texts; for example, advertisers might be interested in the sentiment of product reviews on e-commerce sites so they can match marketing material to content more effectively.…”
Section: Introduction
confidence: 99%