Stance Detection is the task of automatically determining whether the author of a text is in favor of, against, or neutral towards a given target. In this paper we investigate the portability of tools performing this task across different languages by analyzing the results achieved by a Stance Detection system (MultiTACOS) trained and tested in a multilingual setting. First, we provide a set of resources on political topics for English, French, Italian, Spanish and Catalan, which includes novel corpora collected for the purpose of this study and benchmark corpora used in Stance Detection tasks and evaluation exercises known in the literature. We focus in particular on the novel corpora, describing their development and comparing them with the benchmarks. Second, MultiTACOS is applied with different sets of features especially designed for Stance Detection, with a specific focus on exploring and combining features based on the textual content of the tweet (e.g., style and affective load) and features based on contextual information that does not emerge directly from the text. Finally, to better highlight the features that most positively affect system performance in the multilingual setting, we provide a feature analysis, together with a qualitative analysis of the misclassified tweets for each of the observed languages, intended to reflect on the open challenges.
The number of messages generated by users on social media platforms has steadily increased in recent years. As a result, developing automated systems for in-depth analysis of user-generated content and interactions is becoming increasingly relevant. In particular, in the domain of online political debates, interest in the automatic classification of users' stance towards a given entity, such as a controversial topic or a politician, within a polarized debate is growing significantly. In this paper we propose a new model for stance detection in Twitter, where authors' messages are not considered in isolation but in a diachronic perspective, shedding light on the dynamics of users' opinion shifts along the temporal axis. Moreover, different types of social network communities, based on retweet, quote, and reply relations, were analyzed in order to extract network-based features to be included in our stance detection model. The model has been trained and evaluated on a corpus of Italian tweets in which users discussed a highly polarized debate in Italy, i.e. the 2016 referendum on the reform of the Italian Constitution. The development of a new corpus annotated for stance is described. Analysis and classification experiments show that network-based features help in detecting stance and confirm the importance of modeling stance in a diachronic perspective.
Stance detection, the task of identifying the speaker's opinion towards a particular target, has attracted the attention of researchers. This paper describes a novel approach for detecting stance in Twitter. We define a set of features that capture the context surrounding a target of interest, with the final aim of training a model to predict the stance towards the mentioned targets. In particular, we are interested in investigating political debates in social media. For this reason we evaluated our approach on two targets of SemEval-2016 Task 6 on Detecting Stance in Tweets that are related to the campaign for the 2016 U.S. presidential elections: Hillary Clinton and Donald Trump. For the sake of comparison with the state of the art, we evaluated our model on the dataset released for the SemEval-2016 Task 6 shared task. Our results outperform the best ones obtained by participating teams, and show that information about the enemies and friends of politicians helps in detecting stance towards them.
This paper focuses on the role of social relations within social media in the formation of public opinion. We propose to combine the detection of users' stance towards BREXIT, carried out by content analysis of Twitter messages, with the exploration of their social relations, relying on social network analysis. The analysis of a novel Twitter corpus on the BREXIT debate, developed for our purposes, shows that like-minded individuals (those sharing the same opinion on the specific issue) are likely to belong to the same social network community. Moreover, opinion-driven homophily is exhibited among neighbours. Interestingly, users' stance shows diachronic evolution.
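One possible way to make "opinion-driven homophily among neighbours" concrete is sketched below. It computes the share of edges that link two users with the same stance and compares it with the share expected if stance were unrelated to network position. The toy graph, the stance labels, and both function names are assumptions made for illustration; this is not the measure reported in the paper.

```python
from collections import Counter


def same_stance_edge_share(edges, stance):
    """Fraction of edges whose two endpoints hold the same stance.

    Edges with an unlabelled endpoint are ignored.
    """
    labelled = [(u, v) for u, v in edges if u in stance and v in stance]
    if not labelled:
        return 0.0
    same = sum(1 for u, v in labelled if stance[u] == stance[v])
    return same / len(labelled)


def expected_share_if_random(stance):
    """Chance that two users drawn independently (with replacement)
    share a stance: the sum of squared label proportions."""
    counts = Counter(stance.values())
    total = sum(counts.values())
    return sum((c / total) ** 2 for c in counts.values())


# Hypothetical toy data: six users, two stances, six undirected ties.
stance = {
    "a": "leave", "b": "leave", "c": "leave",
    "d": "remain", "e": "remain", "f": "remain",
}
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("c", "d")]

observed = same_stance_edge_share(edges, stance)
baseline = expected_share_if_random(stance)
print(f"observed={observed:.3f} baseline={baseline:.3f}")
```

An observed share well above the baseline would be consistent with homophily, although on real data a permutation test over shuffled labels would be a more careful comparison.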
In this work, we apply network science to analyse almost 6 million tweets about the debate around immigration in Italy, collected between 2018 and 2019, when many related events captured the attention of media outlets. Our aim was to better understand the dynamics underlying social media interactions on such a delicate and divisive topic, which actors lead the discussion, and whose messages have the highest chance of reaching the majority of the accounts that follow the debate. The debate on Twitter is represented with networks; we characterise the main clusters by looking at the highest in-degree nodes in each one and by analysing the text of the tweets of all the users. We find a strongly segregated network that shows an explicit interplay with the Italian political and social landscape but seems to be disconnected from the actual geographical distribution and relocation of migrants. In addition, quite surprisingly, the influencers and political leaders that apparently lead the debate do not necessarily belong to the clusters that include the majority of nodes: we find evidence of a 'silent majority' that is more connected to accounts expressing a more positive stance toward migrants, while leaders whose stance is negative apparently attract more attention. Finally, we see that the community structure clearly affects the diffusion of content (URLs), as shown by the presence of both local and global diffusion trends, and that communities tend to display segregation regardless of their political and cultural background. In particular, we observe that messages that spread widely in the two largest clusters, whose most popular members are also notoriously on opposite sides of the political spectrum, have a very low chance of gaining visibility in other clusters.
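The cluster characterisation described above, ranking accounts by in-degree within each cluster, can be sketched as follows. This is a minimal illustration under stated assumptions: edges are taken to point from the retweeting account to the retweeted one, cluster membership is assumed to be already available from some community detection step not shown here, and all account names and the function name are hypothetical.

```python
from collections import Counter, defaultdict


def top_in_degree_by_cluster(edges, cluster_of, k=1):
    """Return, for each cluster, its k nodes with the highest in-degree.

    edges      : iterable of (source, target) pairs, e.g. retweeter -> author
    cluster_of : dict mapping each node to a cluster id (assumed given)
    """
    # In-degree counts how many edges point at each node.
    in_degree = Counter(target for _, target in edges)

    by_cluster = defaultdict(list)
    for node, cluster in cluster_of.items():
        by_cluster[cluster].append(node)

    result = {}
    for cluster, nodes in by_cluster.items():
        # Sort by descending in-degree, breaking ties by name for stability.
        ranked = sorted(nodes, key=lambda n: (-in_degree[n], n))
        result[cluster] = [(n, in_degree[n]) for n in ranked[:k]]
    return result


# Hypothetical toy retweet network with two clusters.
edges = [
    ("u1", "leaderA"), ("u2", "leaderA"), ("u3", "leaderA"),
    ("u4", "ngoB"), ("u5", "ngoB"),
    ("u1", "u2"),
]
cluster_of = {
    "leaderA": 0, "u1": 0, "u2": 0, "u3": 0,
    "ngoB": 1, "u4": 1, "u5": 1,
}

top = top_in_degree_by_cluster(edges, cluster_of, k=1)
print(top)
```

Comparing each cluster's size with the in-degree of its top accounts is one simple way to see whether the most visible accounts sit in the largest clusters.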
Political debates about a reform may spark national controversies, leading members of the community to polarize their opinions and sentiment about the topic addressed. With the rise of social media like Twitter, users are encouraged to voice and share their strong and polarized views, and in general people are exposed to broader viewpoints than they were before. The large amount of user-generated social data available is a great opportunity to investigate the communicative behaviors emerging in the context of such political debates and to shed some light on the way communities of users with different roles in society and different political sentiment interact. In this paper we focused on communications in Twitter around the reform of marriage in France in 2012 and 2013, "Le Mariage Pour Tous", which had been the subject of debate and controversy. We collected a corpus of tweets tagged with the hashtag #mariagepourtous, created to mark the messages about the reform. We applied different kinds of analysis to our dataset, based on linguistic and non-linguistic features of the observed data, in order to investigate communicative behavior in the use of subjective and evaluative language on a political topic. The analysis also led us to reflect on the impact of the different types of users involved in the virtual debate, which included political messages created both by media organizations and by individual users, from ordinary citizens to politicians and celebrities.
We report on the collection of social media messages in the Italian language, from Twitter in particular, that has been ongoing since 2012 at the University of Turin. A number of smaller datasets have been extracted from the main collection and enriched with different kinds of annotations for linguistic purposes. Moreover, a few extra datasets have been collected independently and are now being merged with the main collection. We aim to make the resource available to the community as far as possible, in accordance with the Terms of Service of the platforms from which the data have been gathered.