Proceedings of the Fifth International Workshop on Natural Language Processing for Social Media 2017
DOI: 10.18653/v1/w17-1105

Character-based Neural Embeddings for Tweet Clustering

Abstract: In this paper, we show how the performance of tweet clustering can be improved by leveraging character-based neural networks. The proposed approach overcomes the limitations related to vocabulary explosion in word-based models and allows for seamless processing of multilingual content. Our evaluation results and code are available online.

Cited by 10 publications (8 citation statements). References 15 publications.
“…Related tasks include topic detection on Twitter task (Papadopoulos et al, 2014;Litvak et al, 2016;Zhou et al, 2016;Vakulenko et al, 2017), binary classification of Tweets (Rosenthal et al, 2017), classification of news related or political stance Tweets (Ghelani et al, 2017;Johnson and Goldwasser, 2016;Volkova et al, 2017), classification of news related articles (Ribeiro et al, 2017), and other classifications in NLP. Binary classification of texts and classification of news related texts and articles are most closely related to this task.…”
Section: Related Work (mentioning)
confidence: 99%
“…However, choosing important information for news reports from Twitter is very tough, because Twitter contains a vast amount of posts. For this reason, many researchers have studied how to extract important posts for each purpose (Papadopoulos et al, 2014;Litvak et al, 2016;Zhou et al, 2016;Vakulenko et al, 2017). A system using Neural Networks (NNs) has been developed by using models that are trained by extracting Tweets in factual TV program production, and these systems extract "Event-describing Tweets (EVENT)" which include incidents or accidents information for news reports from a large amount of Tweets (Miyazaki et al, 2017).…”
Section: Introduction (mentioning)
confidence: 99%
“…OLDA+HC. Additionally, we also evaluate a tweet clustering approach [35] that uses character-based tweet embeddings (i.e. Tweet2Vec [8]) and outperforms the winner [13] of the 2014 SNOW breaking news detection competition 2,3 which was defined as a topic detection task.…”
Section: Datasets (mentioning)
confidence: 99%
“…This method was named as Tweet2Vec+HC. All document clustering baselines employ a hierarchical agglomerative clustering algorithm as it is proven to be effective in [35].…”
Section: Datasets (mentioning)
confidence: 99%