Background: Topic lifecycle analysis on social networks aims to analyze and track how topics are born from user-generated content, and how they evolve. Twitter researchers have no agreed-upon definition of topics; topics on Twitter are typically derived in the form of (a) frequently used hashtags, (b) keywords that show a sudden surge in occurrences over a short span of time (“bursty keywords”), or (c) concepts latent within the tweets that are grouped using variations of semantic clustering techniques.

Methods: In the current paper, we jointly model the hashtags present and the semantic concepts embedded in the content, which in turn helps us identify hashtag groups that are used by a large number of tweets and that define a “topic” (a concept space).

Results: We observe that, within a given cluster, different hashtags are more prominent than the others at different times. We further observe that the participation and influence levels of the different users play important roles in determining which hashtag is more prominent than the others at a given time. We thus observe that topics often morph from one to another (via morphing of the dominant hashtags representing the same semantic concept space) rather than becoming extinct outright, which is a novel insight about topic lifecycles. We further present novel observations about the role of users in determining the lifecycle of discussion topics on Twitter.

Conclusions: We infer that topic lifecycles are governed by user interests, and not by user influence, which is a key observation made by our work.
Topic lifecycle analysis on Twitter, a branch of study that investigates Twitter topics from their birth through their lifecycle to their death, has gained immense popularity in mainstream research. In the literature, topics are often treated as one of (a) hashtags (independent of other hashtags), (b) a burst of keywords in a short time span, or (c) a latent concept space captured by advanced text analysis methodologies, such as Latent Dirichlet Allocation (LDA). The first two approaches cannot recognize topics where different users use different hashtags to express the same (semantically related) concept, while the third approach misses the user's explicit intent expressed via hashtags. In our work, we cluster hashtags using a word-embedding-based approach together with the temporal concurrency of hashtag usage, thus forming topics (semantically and temporally related groups of hashtags). We present a novel analysis of topic lifecycles with respect to communities. We characterize the participation of social communities in the topic clusters, and analyze the lifecycle of topic clusters with respect to such participation. We derive first-of-their-kind insights into the complex evolution of topics over communities and time: the temporal morphing of topics over hashtags within communities, how hashtags die in some communities but morph into other hashtags in other communities (that is, it is a community-level phenomenon), and how specific communities adopt specific hashtags. Our work is fundamental in the space of topic lifecycle modeling and understanding in communities: it redefines our understanding of topic lifecycles and shows that the social boundaries of topic lifecycles are deeply intertwined with community behavior.
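The idea of forming topics from hashtags that are both semantically and temporally related can be illustrated with a small sketch. It is not the authors' implementation: the toy embedding vectors, the per-interval usage counts, the two cosine-similarity thresholds (`sem_min`, `time_min`), and the use of connected components as the clustering step are all assumptions chosen for illustration.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def cluster_hashtags(embeddings, usage, sem_min=0.8, time_min=0.5):
    """Group hashtags that are both semantically and temporally related.

    embeddings: hashtag -> vector (hypothetical, e.g. averaged word vectors)
    usage:      hashtag -> per-interval tweet counts, same length for all tags
    Two hashtags are linked when both similarities reach their (assumed)
    thresholds; clusters are the connected components of that link graph.
    """
    tags = sorted(embeddings)
    parent = {t: t for t in tags}  # union-find forest

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    for i, a in enumerate(tags):
        for b in tags[i + 1:]:
            semantic = cosine(embeddings[a], embeddings[b])
            temporal = cosine(usage[a], usage[b])
            if semantic >= sem_min and temporal >= time_min:
                parent[find(a)] = find(b)

    clusters = {}
    for t in tags:
        clusters.setdefault(find(t), []).append(t)
    return sorted(clusters.values())


# Toy data, invented for illustration only.
embeddings = {
    "#worldcup": [0.9, 0.1, 0.0],
    "#fifa": [0.8, 0.2, 0.1],
    "#football": [0.85, 0.15, 0.0],
    "#election": [0.0, 0.1, 0.9],
    "#vote": [0.1, 0.0, 0.95],
}
usage = {
    "#worldcup": [5, 40, 30, 2],
    "#fifa": [3, 35, 20, 1],
    "#football": [50, 1, 0, 0],  # semantically close to #worldcup, but used earlier
    "#election": [0, 2, 50, 60],
    "#vote": [1, 0, 45, 70],
}

clusters = cluster_hashtags(embeddings, usage)
for group in clusters:
    print(group)
```

In this toy data, `#football` is semantically close to `#worldcup` and `#fifa` but is used in a different time window, so it stays in its own group. That is the case where a purely semantic clustering and a combined semantic and temporal clustering would disagree.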