Social networks are huge, continuous sources of information that can be used to analyze people's behavior and thoughts. Our goal is to extract such information and predict the political inclinations of users. In particular, this paper investigates the importance of syntactic features of texts written by users in this process. Our hypothesis is that people belonging to the same political party write in similar ways, and thus they can be classified properly on the basis of the words that they use. We analyze tweets because Twitter is commonly used in Italy for discussing politics; moreover, it provides an official API that can be easily exploited for data extraction. Many classifiers were applied to different kinds of features and NLP vectorization methods in order to find the method best able to confirm our hypothesis. To evaluate their accuracy, a set of current Italian deputies with consistent activity on Twitter was selected as ground truth, and we then predicted their political party. Using the results of our analysis, we also gained interesting insights into current Italian politics.
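The hypothesis that word choice alone can separate authors by party can be illustrated with a very small bag-of-words classifier. The sketch below is not the method used in the paper, which compares many classifiers and vectorization methods without naming them here; it assumes a multinomial Naive Bayes model with add-one smoothing as one plausible baseline. The class name `NaiveBayesTextClassifier`, the labels `party_a` and `party_b`, and the short English training texts are all hypothetical and chosen only for readability.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a text and split it into word tokens (bag-of-words)."""
    return re.findall(r"\w+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes over word counts, with add-one smoothing.

    Hypothetical illustration only; not the classifier from the paper.
    """

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter(labels)        # label -> number of texts
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        # Words never seen in training carry no evidence and are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior of the label plus log likelihood of each token.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocabulary)
            for t in tokens:
                score += math.log((counts[t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data: two hypothetical parties with distinct vocabularies.
train_texts = [
    "lower taxes and support for small business",
    "cut taxes to help business grow",
    "protect public healthcare and schools",
    "invest in schools and public transport",
]
train_labels = ["party_a", "party_a", "party_b", "party_b"]

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
prediction = model.predict("we must cut taxes for every business")
print(prediction)
```

In a realistic setting the training texts would be tweets by accounts with known party membership, and accuracy would be measured on held-out accounts rather than on a single sentence.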
With the increase of digital interaction, social networks are becoming an essential ingredient of our lives, progressively becoming the dominant media, e.g., in influencing political choices. Interaction within social networks tends to take place within communities, sets of social accounts which share friendships, ideas, interests and passions; detecting digital communities is of increasing relevance, from a social and economic point of view. In this paper, we argue that the vocabulary of terms used in social interaction is a very distinctive feature of a community, hence it can be effectively used for community detection. We show that, by inspecting the vocabulary used in tweets, we can achieve very efficient classifiers and predictors of account membership within a given community. We describe the syntactic and semantic features that best constitute a vocabulary, then we provide their comparative evaluation and select the best features for the task, and finally we illustrate several applications of our approach to concrete community detection scenarios.
Online social media are changing the news industry and revolutionizing the traditional role of journalists and newspapers. In this scenario, investigating the behaviour of users in relation to news sharing is relevant, as it provides means for understanding the impact of online news, their propagation within social communities, their impact on the formation of opinions, and also for effectively detecting individual stances relative to specific news or topics. Our contribution is two-fold. First, we build a robust pipeline for collecting datasets describing news sharing; the pipeline takes as input a list of news sources and generates a large collection of articles, of the accounts that provide them on social media either directly or by retweeting, and of the social activities performed by these accounts. Second, we provide a large-scale dataset, built using the aforementioned tool, that can be used to study the social behavior of Twitter users and their involvement in the dissemination of news items. Finally, we show an application of our data collection in the context of political stance classification and we suggest other potential usages of the presented resources.