Until recently, social media were seen to promote democratic discourse on social and political issues. However, this powerful communication ecosystem has come under scrutiny for allowing hostile actors to exploit online discussions in an attempt to manipulate public opinion. A case in point is the ongoing U.S. Congress investigation of Russian interference in the 2016 U.S. election campaign, with Russia accused of, among other things, using trolls (malicious accounts created for the purpose of manipulation) and bots (automated accounts) to spread propaganda and politically biased information. In this study, we explore the effects of this manipulation campaign, taking a closer look at users who re-shared posts produced on Twitter by the Russian troll accounts publicly disclosed by the U.S. Congress investigation. We collected a dataset of 13 million election-related posts shared on Twitter in 2016 by over a million distinct users. This dataset includes accounts associated with the identified Russian trolls as well as users sharing posts in the same period on a variety of topics around the 2016 elections. We use label propagation to infer users' ideology based on the news sources they share, and are able to classify a large number of the users as liberal or conservative.
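The abstract names label propagation over shared news sources as the ideology-inference method. Below is a minimal sketch of that idea using scikit-learn's LabelPropagation; the source names, share counts, and seed labels are invented placeholders, not the paper's actual data or pipeline.

```python
# Hypothetical sketch: inferring user ideology via label propagation,
# as the abstract describes. Sources, counts, and seeds are illustrative.
import numpy as np
from sklearn.semi_supervised import LabelPropagation

# Columns: counts of shares from each (assumed) news source.
SOURCES = ["liberal_outlet_1", "liberal_outlet_2",
           "conservative_outlet_1", "conservative_outlet_2"]

# Rows: users; values: how often each user shared each source.
X = np.array([
    [5, 3, 0, 1],   # mostly liberal outlets
    [0, 1, 6, 4],   # mostly conservative outlets
    [4, 2, 1, 0],
    [1, 0, 3, 5],
    [2, 2, 2, 2],   # mixed sharing
])

# Seed labels: 0 = liberal, 1 = conservative, -1 = unlabeled.
y = np.array([0, 1, -1, -1, -1])

model = LabelPropagation(kernel="knn", n_neighbors=2)
model.fit(X, y)

# Labels propagate from the seed users to unlabeled users who
# share a similar mix of news sources.
print(model.transduction_)
```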
In recent years, social media have revolutionized how people communicate and share information. One function of social media, besides connecting with friends, is sharing opinions with others. Microblogging sites like Twitter have often provided an online forum for social activism. When users debate controversial topics on social media, they typically share different types of evidence to support their claims. Classifying these types of evidence can provide an estimate of how adequately the arguments have been supported. We first introduce a manually built gold-standard dataset of 3,000 tweets related to the recent FBI-Apple encryption debate. We then develop a framework for automatically classifying six evidence types typically used on Twitter to discuss the debate. Our findings show that a Support Vector Machine (SVM) classifier trained with n-gram and additional features is capable of capturing the different forms of representing evidence on Twitter, and exhibits significant improvements over the unigram baseline, achieving a macro-averaged F1 of 82.8%.
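The abstract names the core setup: an SVM trained on n-gram features for evidence-type classification. A minimal sketch of that setup follows; the tweets, the four evidence labels shown, and the TF-IDF weighting are illustrative assumptions, not the gold-standard dataset or the paper's exact feature set.

```python
# Hypothetical sketch of an SVM over n-gram features for
# evidence-type classification. Tweets and labels are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

tweets = [
    "According to the court filing, Apple must comply.",    # news report
    "I saw the keynote myself; they never promised this.",  # personal experience
    "A 2015 study found backdoors weaken all encryption.",  # research evidence
    "Everyone knows the FBI just wants more power.",        # unsupported claim
]
labels = ["news", "experience", "research", "unsupported"]

# Unigrams plus bigrams, a typical n-gram feature set.
clf = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LinearSVC(),
)
clf.fit(tweets, labels)

print(clf.predict(["A recent paper showed the exploit is real."]))
```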
The ease with which information can be shared on social media has opened it up to abuse and manipulation. One example of a manipulation campaign that has garnered much attention recently is the alleged Russian interference in the 2016 U.S. elections, with Russia accused of, among other things, using trolls and malicious accounts to spread misinformation and politically biased information. To take an in-depth look at this manipulation campaign, we collected a dataset of 13 million election-related posts shared on Twitter in 2016 by over a million distinct users. This dataset includes accounts associated with the identified Russian trolls as well as users sharing posts in the same period on a variety of topics around the 2016 elections. To study how these trolls attempted to manipulate public opinion, we identified 49 theoretically grounded linguistic markers of deception and measured their use by troll and non-troll accounts. We show that deceptive language cues can help to accurately identify trolls, with an average F1 score of 82% and recall of 88%.
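To make the deception-marker approach concrete, here is a minimal sketch of scoring text against simple linguistic cues and classifying accounts. The two cue lists and the toy labels are illustrative assumptions; the paper's 49 theoretically grounded markers are not reproduced here.

```python
# Hypothetical sketch: deception cue rates as classifier features.
# Cue lists (HEDGES, SELF_REFS) and labels are invented stand-ins.
import re
from sklearn.linear_model import LogisticRegression

HEDGES = {"maybe", "perhaps", "possibly", "allegedly"}
SELF_REFS = {"i", "me", "my", "mine"}

def cue_features(text: str) -> list:
    tokens = re.findall(r"[a-z']+", text.lower())
    n = max(len(tokens), 1)
    return [
        sum(t in HEDGES for t in tokens) / n,     # hedging rate
        sum(t in SELF_REFS for t in tokens) / n,  # self-reference rate
        len(tokens),                              # message length
    ]

# Toy training data: 1 = troll, 0 = non-troll (invented labels).
posts = [
    "Allegedly the votes were maybe rigged, possibly by insiders.",
    "I went to vote with my family this morning.",
]
X = [cue_features(p) for p in posts]
y = [1, 0]

clf = LogisticRegression().fit(X, y)
print(clf.predict([cue_features("Perhaps the results were allegedly fake.")]))
```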
Social media have enabled a revolution in user-generated content: they allow users to connect, build community, produce and share content, and publish opinions. To better understand online users' attitudes and opinions, we use stance classification, a relatively new and challenging approach that deepens opinion mining by classifying a user's stance in a debate. Our use case is tweets related to the spring 2016 debate over the FBI's request that Apple decrypt a user's iPhone. In this "encryption debate," public opinion was polarized between advocates of individual privacy and advocates of national security. We propose a machine learning approach to classify stance in the debate, along with a topic classification, using lexical, syntactic, Twitter-specific, and argumentative features as predictors. Models trained on these feature sets showed significant increases in accuracy relative to the unigram baseline.
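Since the abstract's key method detail is combining heterogeneous feature groups, here is a minimal sketch of joining lexical n-gram features with simple Twitter-specific counts before training a classifier. The example tweets, stance labels, and the two surface counts are assumptions for illustration, not the paper's full feature set.

```python
# Hypothetical sketch: combining lexical and Twitter-specific
# features for stance classification. Data and features are placeholders.
import numpy as np
from scipy.sparse import hstack
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC

tweets = [
    "Apple is right, privacy is a basic right #standwithapple",
    "The FBI needs access to stop terrorists, unlock it @FBI",
]
stances = ["pro-privacy", "pro-security"]

# Lexical features: word n-grams.
vec = CountVectorizer(ngram_range=(1, 2))
lexical = vec.fit_transform(tweets)

# Twitter-specific features: hashtag and mention counts.
twitter_feats = np.array([[t.count("#"), t.count("@")] for t in tweets])

# Concatenate both feature groups into one matrix and train.
X = hstack([lexical, twitter_feats])
clf = LinearSVC().fit(X, stances)
```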
In this paper, we evaluate the predictability of tweets associated with controversial versus non-controversial topics. As a first step, we crowd-sourced the scoring of a predefined set of topics on a Likert scale from non-controversial to controversial. Our feature set includes and goes beyond sentiment features, e.g., by leveraging empathic language and other features that have been used previously but are new to this particular study. We find that focusing on the structural characteristics of tweets is beneficial for this task. Using a combination of empathic, language-specific, and Twitter-specific features for supervised learning resulted in 87% (F1) under cross-validation on the training set and 63.4% on the test set. Our analysis shows that features specific to Twitter, or to social media in general, are more prevalent in tweets on controversial topics than in non-controversial ones. To test the premise of the paper, we conducted two additional sets of experiments, which yielded mixed results. These findings will inform our future investigations into the relationship between language use on social media and the perceived controversiality of topics.
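As a small illustration of the crowd-sourcing step, the sketch below turns per-topic Likert ratings into controversial / non-controversial labels by averaging. The topics, ratings, and threshold are invented placeholders; the paper does not specify this aggregation rule.

```python
# Hypothetical sketch: deriving topic labels from crowd-sourced
# Likert ratings (1 = non-controversial, 5 = controversial).
from statistics import mean

ratings = {
    "gun control": [5, 4, 5, 4],
    "encryption":  [4, 3, 4, 5],
    "weather":     [1, 2, 1, 1],
}

THRESHOLD = 3.0  # assumed cut-off, not taken from the paper

for topic, scores in ratings.items():
    label = "controversial" if mean(scores) > THRESHOLD else "non-controversial"
    print(f"{topic}: {mean(scores):.2f} -> {label}")
```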
The coronavirus disease of 2019 (COVID-19) has had a huge impact on economies and societies around the world. While governments took extreme measures to reduce the spread of the virus, people were affected by these new measures. With restrictions such as lockdowns and social distancing, it became important to understand the public's emotional response to the pandemic. In this paper, we study the reaction of Saudi Arabian citizens to the pandemic. We utilize a collection of Arabic tweets sent during 2020, primarily through hashtags originating from Saudi Arabia. Our results show that people maintained a positive reaction toward the pandemic: this positive reaction was at its highest at the beginning of the COVID-19 crisis and declined as time passed. Overall, the results show that people were highly supportive of one another through the pandemic. This research can help researchers and policymakers understand the emotional effect of a pandemic on societies.
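To illustrate the kind of over-time aggregation this finding implies, here is a minimal sketch that scores tweets with a tiny lexicon and averages sentiment by month. The English lexicon, tweets, and months are invented stand-ins; the study's actual Arabic-language pipeline is not described here.

```python
# Hypothetical sketch: tracking average tweet sentiment per month
# with a toy lexicon scorer. All data below is an invented stand-in.
from collections import defaultdict
from statistics import mean

POSITIVE = {"support", "together", "hope"}
NEGATIVE = {"fear", "lockdown", "loss"}

def score(text: str) -> int:
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

# (month, tweet) pairs; in the study these would be Arabic tweets from 2020.
tweets = [
    ("2020-03", "hope we get through this together"),
    ("2020-03", "so much support in my city"),
    ("2020-08", "lockdown fatigue and fear everywhere"),
]

by_month = defaultdict(list)
for month, text in tweets:
    by_month[month].append(score(text))

for month in sorted(by_month):
    print(month, mean(by_month[month]))  # higher early, declining later
```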
COVID-19 has had broadly disruptive effects on economies, healthcare systems, governments, societies, and individuals. Uncertainty concerning the scale of this crisis has given rise to countless rumors, hoaxes, and misinformation, much of which now circulates online, particularly on social media platforms like Twitter. This study used a data-driven approach to map the contours of misinformation and contextualize the COVID-19 pandemic with regard to socio-religious-political information. The work combines quantitative and qualitative methodologies to assess how information-exchanging behaviors can be used to minimize the effects of emergent misinformation. The study found that social media platforms were the most significant source of rumors, transmitting information rapidly through the community: WhatsApp users accounted for about 46% of the sources of rumors on online platforms, while Twitter showed a declining trend of rumors at 41%. Moreover, the results indicate that the second-most common type of misinformation was provided by pharmaceutical companies; however, the most prevalent type of misinformation spreading during this pandemic concerned biological warfare. In this combined retrospective analysis, social media, with varying approaches to public discourse, contribute to efficient public health responses.