2013
DOI: 10.1007/978-3-642-37256-8_28

Distributional Term Representations for Short-Text Categorization

Abstract: Every day, millions of short texts are generated for which effective tools for organization and retrieval are required. Because of the tiny length of these documents and their extremely sparse representations, the direct application of standard text categorization methods is not effective. In this work we propose using distributional term representations (DTRs) for short-text categorization. DTRs represent terms by means of contextual information, given by document occurrence and term co-occurrence…
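As a rough illustration of the idea, the sketch below builds a simple term co-occurrence representation from a toy corpus and represents a short document by aggregating its terms' context vectors. The weighting (a log-scaled inverse document frequency) and the aggregation by summation are assumptions made for illustration, not the exact scheme from the paper:

```python
import math
from collections import defaultdict

# Toy corpus: each "document" is a short text (a list of tokens).
docs = [
    "cheap flights to london".split(),
    "book cheap hotel in london".split(),
    "python tutorial for beginners".split(),
    "learn python programming online".split(),
]

vocab = sorted({t for d in docs for t in d})

# Term co-occurrence counts: how often term u appears in the same
# document as term t. This is the "contextual information" a DTR uses.
cooc = defaultdict(lambda: defaultdict(int))
for d in docs:
    for t in d:
        for u in d:
            if u != t:
                cooc[t][u] += 1

# Document frequency, used for a tf-idf-like weighting of contexts
# (an assumption; the paper's exact weighting scheme may differ).
df = defaultdict(int)
for d in docs:
    for t in set(d):
        df[t] += 1

def term_vector(t):
    """Distributional representation of term t over the vocabulary."""
    n_docs = len(docs)
    return [
        cooc[t][u] * math.log(1 + n_docs / df[u]) if u in cooc[t] else 0.0
        for u in vocab
    ]

def doc_vector(tokens):
    """Represent a short text as the sum of its terms' context vectors,
    enriching the sparse bag-of-words with contextual evidence."""
    vec = [0.0] * len(vocab)
    for t in tokens:
        for i, w in enumerate(term_vector(t)):
            vec[i] += w
    return vec

# Even a one-word "short text" gets a non-trivial, enriched representation.
print(dict(zip(vocab, doc_vector(["london"]))))
```

The point of the sketch is that a document too short to yield a useful bag-of-words still inherits descriptive weight from the contexts its few terms appear in.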

Cited by 8 publications (5 citation statements) · References 21 publications (28 reference statements)

Citation statements, ordered by relevance:
“…Cabrera et al. [2] proposed a semi-supervised short-text classification method, Distributional Term Representations (DTRs), which enriches document representations with contextual information and thereby mitigates, to some extent, the short length and high sparseness of short texts.…”
Section: Related Work (mentioning)
confidence: 99%
“…The idea of this distributional representation is that the semantics of a term $t_i$ can be revealed through the other terms it co-occurs with in documents of the collection [2,9]. Thus, each term in the set of unique terms (the full vocabulary) $T$ of the document collection, $t_i \in T$, is represented by a weight vector $t_i = \langle w_1, w_2, \ldots, w_n \rangle$, where $w_j$ indicates the contribution of term $j$ to the semantic description of $t_i$, as given by the following formula:…”
Section: Stage 2: Enriching the Initial Lexicon via a Rep… (unclassified)
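The quoted passage is truncated before the formula itself. In the DTR literature such weights are typically tf-idf-style scores computed over co-occurrence counts; the LaTeX below is a plausible reconstruction under that assumption only, not the formula from the cited paper:

```latex
% Hypothetical TCOR-style weighting (an assumption, not the quoted paper's
% exact formula): term t_j's contribution to the description of t_i combines
% its co-occurrence frequency with t_i and an inverse-frequency factor.
w_j = \mathrm{tf}(t_j, D_{t_i}) \times \log\!\left(\frac{|T|}{\mathrm{df}(t_j)}\right)
% where D_{t_i} is the set of documents containing t_i, tf counts the
% occurrences of t_j in those documents, and df(t_j) is t_j's document
% frequency in the collection.
```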
“…These extensions aim at helping naïve Bayes deal with missing information, usually at the attribute level, for instance by equipping the classifiers with mechanisms to work under highly sparse representations (e.g., in short-text categorization) [15,2,7,19]. These methods are mostly based on smoothing attribute-class probabilities and often use co-occurrence statistics.…”
Section: Extending Naïve Bayes (mentioning)
confidence: 99%
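As a rough illustration of what "smoothing attribute-class probabilities" means in this setting, here is a minimal multinomial naïve Bayes with Laplace (add-alpha) smoothing; the cited extensions [15,2,7,19] go further, e.g. by incorporating co-occurrence statistics, which this sketch does not:

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels, alpha=1.0):
    """Multinomial naive Bayes with Laplace (add-alpha) smoothing of the
    attribute-class probabilities P(term | class). Smoothing keeps unseen
    terms in sparse short texts from zeroing out a class score."""
    vocab = {t for d in docs for t in d}
    class_counts = Counter(labels)
    term_counts = defaultdict(Counter)  # class -> term -> count
    for d, y in zip(docs, labels):
        term_counts[y].update(d)
    n = len(docs)
    priors = {c: math.log(k / n) for c, k in class_counts.items()}
    likelihoods = {}
    for c in class_counts:
        total = sum(term_counts[c].values()) + alpha * len(vocab)
        likelihoods[c] = {
            t: math.log((term_counts[c][t] + alpha) / total) for t in vocab
        }
    return priors, likelihoods

def predict(doc, priors, likelihoods):
    """Score each class by log prior plus smoothed log likelihoods."""
    scores = {
        c: priors[c] + sum(likelihoods[c][t] for t in doc if t in likelihoods[c])
        for c in priors
    }
    return max(scores, key=scores.get)

docs = [["cheap", "flights"], ["hotel", "deal"], ["python", "code"]]
labels = ["travel", "travel", "programming"]
priors, lik = train_nb(docs, labels)
print(predict(["cheap", "hotel"], priors, lik))  # -> "travel"
```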
“…This paper motivates further work on extending this model for early text classification. For instance, one could define or modify adaptive priors that change as the value of t increases; the same idea could be implemented with methods that take term dependencies into account (see, e.g., [6,17,20]) in order to increase the predictive power of the classifier; one could also adopt advanced or alternative smoothing techniques to properly account for partial and missing information [15,2,7]; among many other possibilities. The main goal of this paper is to show that naïve Bayes can be used for early text classification and that its performance is competitive with the single existing solution to this problem.…”
Section: Early Naïve Bayes (mentioning)
confidence: 99%
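To make the early-classification scenario concrete, the sketch below classifies a term stream incrementally and stops as soon as the posterior of one class passes a threshold. The stopping rule, the threshold value, and the training details are illustrative assumptions, not the evaluated method from the cited work:

```python
import math
from collections import Counter, defaultdict

def train(docs, labels, alpha=1.0):
    """Laplace-smoothed multinomial naive Bayes, as in the previous sketch."""
    vocab = {t for d in docs for t in d}
    cc = Counter(labels)
    tc = defaultdict(Counter)
    for d, y in zip(docs, labels):
        tc[y].update(d)
    priors = {c: math.log(k / len(docs)) for c, k in cc.items()}
    lik = {
        c: {t: math.log((tc[c][t] + alpha) /
                        (sum(tc[c].values()) + alpha * len(vocab)))
            for t in vocab}
        for c in cc
    }
    return priors, lik

def early_predict(stream, priors, lik, threshold=0.9):
    """Update per-class log scores one term at a time; stop early once one
    class dominates the posterior (threshold is an illustrative choice)."""
    scores = dict(priors)
    for i, t in enumerate(stream, 1):
        for c in scores:
            if t in lik[c]:
                scores[c] += lik[c][t]
        # Normalize log scores into a posterior via log-sum-exp.
        m = max(scores.values())
        z = sum(math.exp(s - m) for s in scores.values())
        post = {c: math.exp(s - m) / z for c, s in scores.items()}
        best = max(post, key=post.get)
        if post[best] >= threshold:
            return best, i  # prediction and number of terms read
    return best, i  # fall back to the full-text prediction

docs = [["cheap", "flights", "london"], ["python", "code", "tutorial"]]
priors, lik = train(docs, ["travel", "programming"])
print(early_predict(["cheap", "flights", "now"], priors, lik))
```

Adaptive priors, as suggested in the quote, would correspond to replacing the fixed `priors` dictionary with one that is a function of how many terms have been read so far.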