Which topics spark the most heated debates in social media? Identifying these topics is a first step towards creating systems which pierce echo chambers. In this paper, we perform a systematic methodological study of controversy detection using social media network structure and content. Unlike previous work, rather than identifying controversy in a single hand-picked topic and using domain-specific knowledge, we focus on comparing topics in any domain. Our approach to quantifying controversy is a graph-based three-stage pipeline, which involves (i) building a conversation graph about a topic, which represents alignment of opinion among users; (ii) partitioning the conversation graph to identify potential sides of the controversy; and (iii) measuring the amount of controversy from characteristics of the graph. We perform an extensive comparison of controversy measures, as well as graph building approaches and data sources. We use both controversial and non-controversial topics on Twitter, as well as other external datasets. We find that our new random-walk-based measure outperforms existing ones in capturing the intuitive notion of controversy, and show that content features are vastly less helpful in this task.
Which topics spark the most heated debates on social media? Identifying those topics is not only interesting from a societal point of view, but also allows the filtering and aggregation of social media content for disseminating news stories. In this paper, we perform a systematic methodological study of controversy detection by using the content and the network structure of social media. Unlike previous work, rather than study controversy in a single hand-picked topic and use domain-specific knowledge, we take a general approach to study topics in any domain. Our approach to quantifying controversy is based on a graph-based three-stage pipeline, which involves (i) building a conversation graph about a topic; (ii) partitioning the conversation graph to identify potential sides of the controversy; and (iii) measuring the amount of controversy from characteristics of the graph. We perform an extensive comparison of controversy measures, different graph-building approaches, and data sources. We use both controversial and non-controversial topics on Twitter, as well as other external datasets. We find that our new random-walk-based measure outperforms existing ones in capturing the intuitive notion of controversy, and show that content features are vastly less helpful in this task.
Controversial issues often split the population into groups with opposing views. When such issues emerge on social media, we often observe the creation of "echo chambers," i.e., situations where like-minded people reinforce each other's opinion, but do not get exposed to the views of the opposing side. In this paper we study algorithmic techniques for bridging these chambers, and thus reducing controversy. Specifically, we represent discussions as graphs, and cast our objective as an edge-recommendation problem. The goal of the recommendation is to reduce the controversy score of the graph, measured by a recently-developed metric based on random walks. At the same time, we take into account the acceptance probability of the recommended edges, which represents the probability that the recommended edges materialize in the graph. * This is an abridged version of a homonymous paper that received the best student paper award in ACM WSDM 2017.
We present Spine, an efficient algorithm for finding the "backbone" of an influence network. Given a social graph and a log of past propagations, we build an instance of the independent-cascade model that describes the propagations. We aim at reducing the complexity of that model, while preserving most of its accuracy in describing the data. We show that the problem is inapproximable and we present an optimal, dynamic-programming algorithm, whose search space, albeit exponential, is typically much smaller than that of the brute force, exhaustive-search approach. Seeking a practical, scalable approach to sparsification, we devise Spine, a greedy, efficient algorithm with practically little compromise in quality. We claim that sparsification is a fundamental data-reduction operation with many applications, ranging from visualization to exploratory and descriptive data analysis. As a proof of concept, we use Spine on real-world datasets, revealing the backbone of their influence-propagation networks. Moreover, we apply Spine as a pre-processing step for the influence-maximization problem, showing that computations on sparsified models give up little accuracy, but yield significant improvements in terms of scalability.
Society is often polarized by controversial issues that split the population into groups of opposing views. When such issues emerge on social media, we often observe the creation of 'echo chambers', i.e., situations where like-minded people reinforce each other's opinion, but do not get exposed to the views of the opposing side. In this paper we study algorithmic techniques for bridging these chambers, and thus reducing controversy. Specifically, we represent the discussion on a controversial issue with an endorsement graph, and cast our problem as an edge-recommendation problem on this graph. The goal of the recommendation is to reduce the controversy score of the graph, which is measured by a recently-developed metric based on random walks. At the same time, we take into account the acceptance probability of the recommended edge, which represents how likely the edge is to materialize in the endorsement graph. We propose a simple model based on a recently-developed user-level controversy score, that is competitive with state-of-the-art link-prediction algorithms. We thus aim at finding the edges that produce the largest reduction in the controversy score, in expectation. To solve this problem, we propose an efficient algorithm, which considers only a fraction of all the combinations of possible edges. Experimental results show that our algorithm is more efficient than a simple greedy heuristic, while producing comparable score reduction. Finally, a comparison with other state-of-the-art edge-addition algorithms shows that this problem is fundamentally different from what has been studied in the literature.
Echo chambers, i.e., situations where one is exposed only to opinions that agree with their own, are an increasing concern for the political discourse in many democratic countries. This paper studies the phenomenon of political echo chambers on social media. We identify the two components in the phenomenon: the opinion that is shared, and the "chamber" (i.e., the social network) that allows the opinion to "echo" (i.e., be re-shared in the network), and examine closely how these two components interact. We define a production and consumption measure for social-media users, which captures the political leaning of the content shared and received by them. By comparing the two, we find that Twitter users are, to a large degree, exposed to political opinions that agree with their own. We also find that users who try to bridge the echo chambers, by sharing content with diverse leaning, have to pay a "price of bipartisanship" in terms of their network centrality and content appreciation. In addition, we study the role of "gatekeepers," users who consume content with diverse leaning but produce partisan content (with a single-sided leaning), in the formation of echo chambers. Finally, we apply these findings to the task of predicting partisans and gatekeepers from social and content features. While partisan users turn out relatively easy to identify, gatekeepers prove to be more challenging.
User-generated content that appears on weblogs, wikis and social networks has been increasing at an unprecedented rate. The wealth of information produced by individuals from different geographical locations presents a challenging task of intelligent processing. In this paper, we introduce a methodology to identify notable geographically focused events out of this collection of user-generated information. At the heart of our proposal lie efficient algorithms that identify geographically focused information bursts, attribute them to demographic factors and identify sets of descriptive keywords. We present the results of a prototype evaluation of our algorithms on BlogScope, a large-scale social media warehousing platform. We demonstrate the scalability and practical utility of our proposal running on top of a multi-terabyte text collection.
We study the evolution of long-lived controversial debates as manifested on Twitter from 2011 to 2016. Specifically, we explore how the structure of interactions and content of discussion vary with the level of collective attention, as evidenced by the number of users discussing a topic. Spikes in the volume of users typically correspond to external events that increase the public attention on the topic; for instance, discussions about 'gun control' often erupt after a mass shooting. This work is the first to study the dynamic evolution of polarized online debates at such scale. By employing a wide array of network and content analysis measures, we find consistent evidence that increased collective attention is associated with increased network polarization and network concentration within each side of the debate, and overall more uniform lexicon usage across all users.