Raiders of the Lost Kek: 3.5 Years of Augmented 4chan Posts from the Politically Incorrect Board

A number of violent far-right attacks in recent years have revealed an apparent connection with 'chan culture', not just in the tangible examples of attackers uploading manifestos, final messages and livestreams to chan sites themselves, but in the widespread community support exhibited in some corners of this online subculture where violence is both trivialised and glorified. Commonly, this is manifested in the visual culture present on chan sites, particularly memes, which may be used to promote extreme or even violent narratives under the guise of humour and irony. This paper seeks to understand how the visual culture of chan sites contributes to and/or encourages violent discourse. In particular, we combine quantitative data scraping, ethnography and visual analysis across tens of chan sites ranging in popularity, such as 4chan, 8kun, or neinchan, between March and June 2020. Overall, we collect a dataset of 135K images across different chans and provide a first qualitative characterization of the most popular images shared across different chans.

Section: Methods

mentioning

confidence: 99%

Memes, Radicalisation, and the Promotion of Violence on Chan Sites

Crawford

Keen

Suárez-Tangil

2021

“…(Aliapoulios et al 2021) published a dataset consisting of 183M posts and 13.25M user profiles from Parler, a Twitter alternative. Last, (Papasavva et al 2020) present a dataset with over 3.3M threads and 134.5M posts from the Politically Incorrect board (/pol/) of the imageboard forum 4chan.…”

Section: Related Work

mentioning

confidence: 99%

“…Our dataset provides several opportunities to the research community. First, Voat was evidently the place many banned users and communities moved to after being banned from other platforms (Papasavva et al 2020;Chandrasekharan et al 2017). To this end, our dataset can assist researchers that focus on deplatforming and user migration.…”

mentioning

confidence: 99%

"I Can’t Keep It Up." A Dataset from the Defunct Voat.co News Aggregator

Mekacher

Papasavva

2022

Voat.co was a news aggregator website that shut down on December 25, 2020. The site had a troubled history and was known for hosting various banned subreddits. This paper presents a dataset with over 2.3M submissions and 16.2M comments posted by 113K users in 7.1K subverses (the Voat equivalent of subreddits). Our dataset covers the whole lifetime of Voat, from its development period starting on November 8, 2013, through its founding in April 2014, up until the day it shut down (December 25, 2020). To the best of our knowledge, this work presents the largest and most complete publicly available Voat dataset. Along with the release of this dataset, we present a preliminary analysis covering posting activity and daily user and subverse registration on the platform so that researchers interested in our dataset can know what to expect. Our data may prove helpful to studies of false news dissemination, as we analyze the links users share on the platform, finding that many communities rely on alternative news press, like Breitbart and GatewayPundit, for their daily discussions. In addition, we perform network analysis on user interactions, finding that many users prefer not to interact with subverses outside their narrative interests, which could be helpful to researchers focusing on polarization and echo chambers. Also, since Voat was one of the platforms to which banned Reddit communities migrated, we are confident our dataset will motivate and assist researchers studying deplatforming. Finally, many hateful and conspiratorial communities were very popular on Voat, which makes our work valuable for researchers focusing on toxicity, conspiracy theories, cross-platform studies of social networks, and natural language processing.

“…BERT (Devlin et al 2018), a recent language model from Google, outperforms other traditional techniques like neural networks (Huang, Ou, and Carley 2018) in many NLP tasks including sentiment analysis because it has the ability to capture the context around words. While there are many variations of ABSA with BERT, we choose (Xu et al 2019) as our implementation due to a simplicity while yielding reasonable accuracy when compared to a very complex model like (Rietzler et al 2019).…”

Section: Word

mentioning

confidence: 99%

“…There are several studies trying to understand general activities in web-based discussion forums. (Hine et al 2017), (Papasavva et al 2020) and (Thukral et al 2018) work on understanding properties, trends and characteristics of forums like ephemerality, heavy-tail and anonymity on posts, threads and users. Some focus on specific tasks in forums (Macdonald et al 2015), (Munger et al 2015) (Shrestha et al 2019) and try to identify main actors like hacker users, depressed users and influential users using a variety of techniques including linguistics, behavioral modeling on user activities and graph-based approaches.…”

Section: Related Work

mentioning

confidence: 99%

RAFFMAN: Measuring and Analyzing Sentiment in Online Political Forum Discussions with an Application to the Trump Impeachment

Tachaiya

Gharibshah

Esterling

et al. 2021

Given an online forum, how can we quantify changes in user affect towards a person or an idea over time? We argue that online political forums constitute an untapped opportunity for understanding sentiment toward aspects under discussion. However, the analysis of such forums has received little attention from the research community. In this paper, we develop RAFFMAN, a systematic approach to quantify the impact of external events on the affect of forum users towards a concept, such as a person or an entity. First, we develop an approach to capture and quantify the observed activity: we identify related keywords, filter threads, and establish correlations between events and spikes in the activity. Second, we modify and evaluate state-of-the-art NLP techniques to achieve high accuracy (74%) in a three-class sentiment classification problem. As a case study, we deploy our method to quantify the effect of President Trump’s impeachment on several concepts, including President Trump, Speaker Pelosi, and QAnon. Our data consists of 32M posts from Reddit and 4chan over a span of 6 months from September 2019 to February 2020. This initial analysis hints at an increase in political polarization, especially for people’s affect towards the President. Overall, our work is a building block towards mining the affect of online forum users towards a concept, which constitutes an untapped, massive, and publicly available source of information.