Sources, in the form of selected Facebook pages, can be used as indicators of hate-rich content. Polarized distributed representations built from such content prove superior to generic embeddings in the task of hate speech detection. The same content, however, seems to carry too weak a signal to serve as a proxy for silver labels in a distantly supervised setting. Even so, this signal is stronger than that of gold labels drawn from a different distribution, which prompts a rethinking of the annotation process in the context of highly subjective judgments.
We describe the systems that the RuG Team developed for the Hate Speech Detection Task in Italian Social Media at EVALITA 2018. We submitted a total of eight runs, participating in all four subtasks. In every subtask, our best macro-F1 score was obtained by a Linear SVM using hate-rich embeddings. Our best system obtains competitive results, ranking 6th (out of 14) in HaSpeeDe-FB, 3rd (out of 15) in HaSpeeDe-TW, 8th (out of 13) in Cross-HaSpeeDe_FB, and 6th (out of 13) in Cross-HaSpeeDe_TW.
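Since the runs are compared by macro-F1, the following minimal sketch shows how that metric is computed: the unweighted mean of the per-class F1 scores, so the minority (hateful) class counts as much as the majority class. The function names and the toy label lists are hypothetical and chosen only for illustration; they are not taken from the task data or from the official evaluation script.

```python
def f1_for_class(gold, pred, cls):
    """F1 score for a single class, treating `cls` as the positive label."""
    tp = sum(1 for g, p in zip(gold, pred) if g == cls and p == cls)
    fp = sum(1 for g, p in zip(gold, pred) if g != cls and p == cls)
    fn = sum(1 for g, p in zip(gold, pred) if g == cls and p != cls)
    if tp == 0:
        # No true positives: precision and recall are 0 (or undefined), so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    classes = sorted(set(gold) | set(pred))
    return sum(f1_for_class(gold, pred, c) for c in classes) / len(classes)


# Hypothetical toy labels (1 = hateful, 0 = not hateful), not real task data.
gold = [1, 1, 1, 0, 0, 0, 0, 0]
pred = [1, 1, 0, 0, 0, 0, 0, 1]

score = macro_f1(gold, pred)
print(f"macro-F1 = {score:.4f}")
```

On these toy labels the hateful class has an F1 of 2/3 and the non-hateful class an F1 of 4/5, so the macro average is 11/15, roughly 0.73.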