Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Confere 2015
DOI: 10.3115/v1/p15-2105

Automatic Keyword Extraction on Twitter

Abstract: In this paper, we build a corpus of tweets annotated with keywords using crowdsourcing methods. We identify key differences between this domain and the work performed on other domains, such as news, which keep existing approaches for automatic keyword extraction from generalizing well on Twitter datasets. These differences include the small amount of content in each tweet, the frequent usage of lexical variants and the high variance of the cardinality of keywords present in each tweet. We propose metho…

Cited by 46 publications (34 citation statements)
References 26 publications
“…Because multiple tweets are usually organized by topic, many document-level approaches can also be adopted to achieve the task. In contrast with the previous methods, Marujo et al (2015) focused on the task of extracting keywords from single tweets. They used several unsupervised methods and word embeddings to construct features.…”
Section: Introduction; type: mentioning
confidence: 99%
“…However, our target is to extract keyphrases from single tweets. Along this line, the approaches proposed by Marujo et al [30] and Zhang et al [63] work for single tweets. Marujo et al [30] showed that word embeddings in a system such as MAUI [31] performs better than the TF-IDF baseline [53].…”
Section: Related Work; type: mentioning
confidence: 99%
“…Along this line, the approaches proposed by Marujo et al [30] and Zhang et al [63] work for single tweets. Marujo et al [30] showed that word embeddings in a system such as MAUI [31] performs better than the TF-IDF baseline [53]. Furthermore, the Joint-Layer-RNN model proposed by Zhang et al [63] was shown to be even better than the model proposed by Marujo et al [30].…”
Section: Related Work; type: mentioning
confidence: 99%
“…Recently, keyphrase extraction methods have been extended to social media texts (Bellaachia & Al-Dhelaan, 2012; Marujo et al, 2015; Zhang et al, 2016; Zhao et al, 2011); for example, Twitter and Sina Weibo. Marujo et al (2015) extracted a keyphrase from a single text. This model is based on the MAUI toolkit (2010), which trains a decision tree over a large set of manually engineered features; for example, TF-IDF score, Brown clustering information, and word vectors.…”
Section: Related Work; type: mentioning
confidence: 99%
“…To date, most studies on related fields have focused on the usage of ranking-based models (Bellaachia & Al-Dhelaan, 2012; Marujo et al, 2015) or sequence-tagging models (Zhang, Wang, Gong, & Huang, 2016) via treating messages as independent documents. However, because of the severe data sparsity issue, it is arguable that these methods are suboptimal in recognizing key content from short and informal microblog messages.…”
Section: Introduction; type: mentioning
confidence: 99%