We consider the task of predicting the political views of VKontakte users from the textual data posted on their personal pages. The analysis of social media is an increasingly important area in digital humanities, opinion mining, and natural language processing, because social networks now contain a large amount of meaningful, freely available data describing the views and moods of society. First, we analyzed information from the pages of users in several categories, identified on the basis of political polarization on VKontakte. Personal profiles contain categorical fields with textual values and text fields that the user fills in free form. We encoded the categorical features as one-hot numeric arrays and used the Bag-of-Words model to represent the free-form text. Next, we applied a simple machine learning classifier, a linear Support Vector Machine, to the textual data of each user page. We show that the classifier separates groups of social media users with opposing political views better than it separates adherents of closer political ideologies.
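The feature encoding and classifier described above can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the field names `life_priority` and `about`, the toy profiles, the +1/-1 group labels, and the hyperparameters are all hypothetical, and the linear SVM is trained here with plain subgradient descent on the hinge loss rather than a library solver.

```python
from collections import Counter

CAT_FIELDS = ["life_priority"]   # hypothetical categorical profile field
TEXT_FIELD = "about"             # hypothetical free-form text field

def build_features(profiles):
    """Assign an index to every (field, value) pair and every text token."""
    features = {}
    for p in profiles:
        for f in CAT_FIELDS:
            features.setdefault(("cat", f, p[f]), len(features))
        for tok in p[TEXT_FIELD].lower().split():
            features.setdefault(("tok", tok), len(features))
    return features

def vectorize(profile, features):
    """One-hot encode categorical fields and count tokens (Bag-of-Words)."""
    x = [0.0] * len(features)
    for f in CAT_FIELDS:
        idx = features.get(("cat", f, profile[f]))
        if idx is not None:          # unseen categories are ignored
            x[idx] = 1.0
    for tok, n in Counter(profile[TEXT_FIELD].lower().split()).items():
        idx = features.get(("tok", tok))
        if idx is not None:          # unseen tokens are ignored
            x[idx] = float(n)
    return x

def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Subgradient descent on the L2-regularised hinge loss; labels are +1/-1."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            violated = margin < 1
            for j in range(len(w)):
                grad = lam * w[j] - (yi * xi[j] if violated else 0.0)
                w[j] -= lr * grad
            if violated:
                b += lr * yi
    return w, b

def predict(x, w, b):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1

# Toy, made-up training data: two groups labelled +1 and -1.
train = [
    ({"life_priority": "career", "about": "markets reform business"}, 1),
    ({"life_priority": "career", "about": "business freedom markets"}, 1),
    ({"life_priority": "family", "about": "tradition order stability"}, -1),
    ({"life_priority": "family", "about": "stability tradition faith"}, -1),
]
profiles = [p for p, _ in train]
labels = [label for _, label in train]

features = build_features(profiles)
X = [vectorize(p, features) for p in profiles]
w, b = train_linear_svm(X, labels)

new_profile = {"life_priority": "career", "about": "markets business"}
prediction = predict(vectorize(new_profile, features), w, b)
```

In practice one would more likely rely on an established library; for example, scikit-learn provides `OneHotEncoder`, `CountVectorizer`, and `LinearSVC`, which cover the same three steps.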