The internet-based incel subculture has evolved over the past decade across a number of different platforms. The subculture is known to be toxic and has become associated with several high-profile cases of lethal violence. In this paper, we study the level of toxic language and its targets on three large incel forums: incels.co, lookism.net and looksmax.me. These three forums are the most well-known and active online platforms where incels meet and hold discussions. Our results show that even though toxic language is pervasive on all three forums, the forums differ significantly in the composition of their toxicity. These differences correspond to different groups or philosophies within the incel communities.
In this study, we examined the possibility of extracting personality traits from a text. We created an extensive dataset by having experts annotate personality traits in a large number of texts from multiple online sources. From these annotated texts, we selected a sample and made further annotations, resulting in a large low-reliability dataset and a small high-reliability dataset. We then used the two datasets to train and test several machine learning models, including a language model, to extract personality from text. Finally, we evaluated our best models in the wild, on datasets from different domains. Our results show that the models based on the small high-reliability dataset performed better (in terms of R²) than the models based on the large low-reliability dataset. Also, the language model based on the small high-reliability dataset performed better than the random baseline. Finally, and more importantly, the results showed that our best model did not perform better than the random baseline when tested in the wild. Taken together, our results show that determining personality traits from a text remains a challenge and that no firm conclusions can be drawn about model performance before testing in the wild.
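To make the comparison against a random baseline concrete, the following is a minimal sketch of how an R² score for a model can be set against the R² of a random predictor. It is not the authors' evaluation code: the function names, the way the baseline is drawn (sampling at random from the training labels) and the trait scores are all assumptions made for illustration.

```python
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def random_baseline(y_train, n, seed=0):
    """Hypothetical baseline: predictions drawn at random from training labels."""
    rng = random.Random(seed)
    return [rng.choice(y_train) for _ in range(n)]


# Made-up trait scores, for illustration only (not data from the study).
y_train = [1.0, 2.0, 3.0, 4.0, 5.0]
y_test = [2.0, 3.0, 4.0]
model_pred = [2.5, 3.0, 3.5]  # hypothetical model output

model_r2 = r_squared(y_test, model_pred)
baseline_r2 = r_squared(y_test, random_baseline(y_train, len(y_test)))

# A model is only informative if its R² exceeds that of the baseline.
model_beats_baseline = model_r2 > baseline_r2
```

An R² of 1 means perfect prediction, 0 means the model does no better than always predicting the mean of the test labels, and negative values mean it does worse. Running the same comparison on data from a different domain corresponds to the "in the wild" test described above.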