2013
DOI: 10.1007/978-3-642-37256-8_28

Distributional Term Representations for Short-Text Categorization

Abstract: Every day, millions of short texts are generated for which effective tools for organization and retrieval are required. Because of the tiny length of these documents and their extremely sparse representations, the direct application of standard text categorization methods is not effective. In this work we propose using distributional term representations (DTRs) for short-text categorization. DTRs represent terms by means of contextual information, given by document occurrence and term co-occurrence…
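As a rough illustration of the idea, the sketch below builds a simple term co-occurrence representation from a toy corpus and represents a short document by aggregating its terms' context vectors. The weighting (a log-scaled inverse document frequency) and the aggregation by summation are assumptions made for illustration, not the exact scheme from the paper:

```python
import math
from collections import defaultdict

# Toy corpus: each "document" is a short text (a list of tokens).
docs = [
    "cheap flights to london".split(),
    "book cheap hotel in london".split(),
    "python tutorial for beginners".split(),
    "learn python programming online".split(),
]

vocab = sorted({t for d in docs for t in d})

# Term co-occurrence counts: how often term u appears in the same
# document as term t. This is the "contextual information" a DTR uses.
cooc = defaultdict(lambda: defaultdict(int))
for d in docs:
    for t in d:
        for u in d:
            if u != t:
                cooc[t][u] += 1

# Document frequency, used for a tf-idf-like weighting of contexts
# (an assumption; the paper's exact weighting scheme may differ).
df = defaultdict(int)
for d in docs:
    for t in set(d):
        df[t] += 1

def term_vector(t):
    """Distributional representation of term t over the vocabulary."""
    n_docs = len(docs)
    return [
        cooc[t][u] * math.log(1 + n_docs / df[u]) if u in cooc[t] else 0.0
        for u in vocab
    ]

def doc_vector(tokens):
    """Represent a short text as the sum of its terms' context vectors,
    enriching the sparse bag-of-words with contextual evidence."""
    vec = [0.0] * len(vocab)
    for t in tokens:
        for i, w in enumerate(term_vector(t)):
            vec[i] += w
    return vec

# Even a one-word "short text" gets a non-trivial, enriched representation.
print(dict(zip(vocab, doc_vector(["london"]))))
```

The point of the sketch is that a document too short to yield a useful bag-of-words still inherits descriptive weight from the contexts its few terms appear in.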

Cited by 8 publications (5 citation statements) · References 21 publications (28 reference statements)

Citation statements, ordered by relevance:
“…Cabrera et al. [2] proposed a semi-supervised short-text classification method, Distributional Term Representations (DTRs), which enriches document representations with contextual information and thereby mitigates, to some extent, the short length and high sparseness of short texts.…”
Section: Related Work (mentioning)
confidence: 99%
“…The idea of this distributional representation is that the semantics of a term $t_i$ can be revealed through the other terms it co-occurs with in documents of the collection [2,9]. Thus, each term in the set of unique terms (the full vocabulary) $T$ of the document collection, $t_i \in T$, is represented by a weight vector $t_i = \langle w_1, w_2, \ldots, w_n \rangle$, where $w_j$ indicates the contribution of term $j$ to the semantic description of $t_i$, as given by the following formula:…”
Section: Stage 2: Enriching the Initial Lexicon via a Rep… (unclassified)
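The quoted passage is truncated before the formula itself. In the DTR literature such weights are typically tf-idf-style scores computed over co-occurrence counts; the LaTeX below is a plausible reconstruction under that assumption only, not the formula from the cited paper:

```latex
% Hypothetical TCOR-style weighting (an assumption, not the quoted paper's
% exact formula): term t_j's contribution to the description of t_i combines
% its co-occurrence frequency with t_i and an inverse-frequency factor.
w_j = \mathrm{tf}(t_j, D_{t_i}) \times \log\!\left(\frac{|T|}{\mathrm{df}(t_j)}\right)
% where D_{t_i} is the set of documents containing t_i, tf counts the
% occurrences of t_j in those documents, and df(t_j) is t_j's document
% frequency in the collection.
```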
“…These extensions aim at helping naïve Bayes deal with missing information, usually at the attribute level, for instance by equipping the classifiers with mechanisms to work under highly sparse representations (e.g., in short-text categorization) [15,2,7,19]. These methods are mostly based on smoothing attribute-class probabilities and often use co-occurrence statistics.…”
Section: Extending Naïve Bayes (mentioning)
confidence: 99%
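As a rough illustration of what "smoothing attribute-class probabilities" means in this setting, here is a minimal multinomial naïve Bayes with Laplace (add-alpha) smoothing; the cited extensions [15,2,7,19] go further, e.g. by incorporating co-occurrence statistics, which this sketch does not:

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels, alpha=1.0):
    """Multinomial naive Bayes with Laplace (add-alpha) smoothing of the
    attribute-class probabilities P(term | class). Smoothing keeps unseen
    terms in sparse short texts from zeroing out a class score."""
    vocab = {t for d in docs for t in d}
    class_counts = Counter(labels)
    term_counts = defaultdict(Counter)  # class -> term -> count
    for d, y in zip(docs, labels):
        term_counts[y].update(d)
    n = len(docs)
    priors = {c: math.log(k / n) for c, k in class_counts.items()}
    likelihoods = {}
    for c in class_counts:
        total = sum(term_counts[c].values()) + alpha * len(vocab)
        likelihoods[c] = {
            t: math.log((term_counts[c][t] + alpha) / total) for t in vocab
        }
    return priors, likelihoods

def predict(doc, priors, likelihoods):
    """Score each class by log prior plus smoothed log likelihoods."""
    scores = {
        c: priors[c] + sum(likelihoods[c][t] for t in doc if t in likelihoods[c])
        for c in priors
    }
    return max(scores, key=scores.get)

docs = [["cheap", "flights"], ["hotel", "deal"], ["python", "code"]]
labels = ["travel", "travel", "programming"]
priors, lik = train_nb(docs, labels)
print(predict(["cheap", "hotel"], priors, lik))  # -> "travel"
```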
“…This paper motivates further work on extending this model for early text classification. For instance, one could define or modify adaptive priors that change as the value of t increases; the same idea could be implemented with methods that take term dependencies into account (see, e.g., [6,17,20]) in order to increase the predictive power of the classifier; one could also adopt advanced or alternative smoothing techniques to properly account for partial and missing information [15,2,7]; among many other possibilities. The main goal of this paper is to show that naïve Bayes can be used for early text classification and that its performance is competitive with the single existing solution to this problem.…”
Section: Early Naïve Bayes (mentioning)
confidence: 99%
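To make the early-classification scenario concrete, the sketch below classifies a term stream incrementally and stops as soon as the posterior of one class passes a threshold. The stopping rule, the threshold value, and the training details are illustrative assumptions, not the evaluated method from the cited work:

```python
import math
from collections import Counter, defaultdict

def train(docs, labels, alpha=1.0):
    """Laplace-smoothed multinomial naive Bayes, as in the previous sketch."""
    vocab = {t for d in docs for t in d}
    cc = Counter(labels)
    tc = defaultdict(Counter)
    for d, y in zip(docs, labels):
        tc[y].update(d)
    priors = {c: math.log(k / len(docs)) for c, k in cc.items()}
    lik = {
        c: {t: math.log((tc[c][t] + alpha) /
                        (sum(tc[c].values()) + alpha * len(vocab)))
            for t in vocab}
        for c in cc
    }
    return priors, lik

def early_predict(stream, priors, lik, threshold=0.9):
    """Update per-class log scores one term at a time; stop early once one
    class dominates the posterior (threshold is an illustrative choice)."""
    scores = dict(priors)
    for i, t in enumerate(stream, 1):
        for c in scores:
            if t in lik[c]:
                scores[c] += lik[c][t]
        # Normalize log scores into a posterior via log-sum-exp.
        m = max(scores.values())
        z = sum(math.exp(s - m) for s in scores.values())
        post = {c: math.exp(s - m) / z for c, s in scores.items()}
        best = max(post, key=post.get)
        if post[best] >= threshold:
            return best, i  # prediction and number of terms read
    return best, i  # fall back to the full-text prediction

docs = [["cheap", "flights", "london"], ["python", "code", "tutorial"]]
priors, lik = train(docs, ["travel", "programming"])
print(early_predict(["cheap", "flights", "now"], priors, lik))
```

Adaptive priors, as suggested in the quote, would correspond to replacing the fixed `priors` dictionary with one that is a function of how many terms have been read so far.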