2016
DOI: 10.1007/978-3-319-30671-1_36

Topics in Tweets: A User Study of Topic Coherence Metrics for Twitter Data

Abstract: Twitter offers scholars new ways to understand the dynamics of public opinion and social discussions. However, in order to understand such discussions, it is necessary to identify coherent topics that have been discussed in the tweets. To assess the coherence of topics, several automatic topic coherence metrics have been designed for classical document corpora. However, it is unclear how suitable these metrics are for topic models generated from Twitter datasets. In this paper, we use crowdsourcing t…

citations
Cited by 28 publications
(38 citation statements)
references
References 16 publications
“…Recently, a large-scale user study in [2] evaluated several coherence metrics including the Wikipedia PMI-based and WordNet-based metrics on tweets. Fang et al. [2] showed that a newly proposed coherence metric leveraging a Twitter background dataset, called the Twitter PMI-based metric (hereafter, T-PMI), has a remarkably high agreement with human judgements on tweet corpora.…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…For example, Newman et al. [8] proposed a Pointwise Mutual Information (PMI)-based metric using Wikipedia as a background dataset to evaluate the coherence of a topic from news articles and books. More recently, a new coherence PMI-based metric using a Twitter background has been proposed for tweet corpora, and was found to be the closest to human judgements [2].…”
Section: Introduction (mentioning)
confidence: 99%
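
The PMI-based coherence idea described in the statements above can be sketched roughly as follows. This is a minimal illustration, not the metric as implemented in the cited papers: it assumes document-level co-occurrence counts, whitespace tokenisation, a small smoothing constant `eps`, and a plain average over word pairs. The function name `pmi_coherence` and the toy background corpus are hypothetical.

```python
import math
from itertools import combinations


def pmi_coherence(topic_words, background_docs, eps=1e-12):
    """Average pairwise PMI of a topic's words over a background corpus.

    Assumptions (not taken from the cited papers): probabilities are
    estimated from document-level co-occurrence, and eps avoids log(0).
    """
    doc_sets = [set(doc.lower().split()) for doc in background_docs]
    n_docs = len(doc_sets)

    def prob(*words):
        # Fraction of background documents containing all given words.
        return sum(all(w in d for w in words) for d in doc_sets) / n_docs

    scores = []
    for w1, w2 in combinations(topic_words, 2):
        p1, p2, p12 = prob(w1), prob(w2), prob(w1, w2)
        if p1 == 0 or p2 == 0:
            continue  # a word unseen in the background gives no evidence
        scores.append(math.log((p12 + eps) / (p1 * p2)))
    return sum(scores) / len(scores) if scores else 0.0


# Toy background corpus standing in for Wikipedia or a tweet collection.
background = [
    "election vote poll",
    "vote election debate",
    "pizza cheese dinner",
    "cheese pizza lunch",
]
coherent = pmi_coherence(["election", "vote"], background)
mixed = pmi_coherence(["election", "pizza"], background)
```

Under these assumptions, swapping the background corpus (for example, Wikipedia versus tweets) changes only the `background_docs` argument, which is the distinction the statements draw between the Wikipedia-based and Twitter-based variants.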
“…A higher score indicates that the topic is easier to understand. Following [22,23], we use a word embedding (WE) representations-based coherence metric to evaluate the coherence of the generated topics, which has been reported to have a high agreement with human judgments. In order to capture the semantic similarity of the latest hashtags and Twitter handle names, we train our WE model using 200 million English tweets posted from 08/2015 to 08/2016.…”
Section: Methods (mentioning)
confidence: 99%
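
A word-embedding-based coherence score of the kind mentioned in the statement above is often computed as the average pairwise similarity of a topic's word vectors. The sketch below assumes cosine similarity and a plain mean, which the quoted text does not specify; the function names and the two-dimensional toy vectors are hypothetical stand-ins for a model trained on tweets.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def embedding_coherence(topic_words, vectors):
    """Mean pairwise cosine similarity of the topic words' embeddings.

    Assumption: words missing from `vectors` are skipped.
    """
    known = [w for w in topic_words if w in vectors]
    pairs = list(combinations(known, 2))
    if not pairs:
        return 0.0
    return sum(cosine(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)


# Toy embeddings; a real model would have hundreds of dimensions.
toy_vectors = {
    "election": [1.0, 0.1],
    "vote": [0.9, 0.2],
    "pizza": [0.0, 1.0],
}
related = embedding_coherence(["election", "vote"], toy_vectors)
unrelated = embedding_coherence(["election", "pizza"], toy_vectors)
```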
“…a Twitter user) into a community. However, while topic modelling approaches and classification techniques have been widely used, challenges still exist, such as 1) existing topic modelling approaches can generate topics lacking coherence for social media data [4,10]; 2) it is not easy to evaluate the coherence of topics [2,3]; 3) it can be challenging to generate a large training dataset for developing a social media user classifier. Hence, we identify four tasks to solve these problems and assist social scientists.…”
Section: Introduction (mentioning)
confidence: 99%