Michael Wiegand scite author profile

This paper presents a survey on hate speech detection. Given the steadily growing body of social media content, the amount of online hate speech is also increasing. Due to the massive scale of the web, methods that automatically detect hate speech are required. Our survey describes key areas that have been explored to automatically recognize these types of utterances using natural language processing. We also discuss limits of those approaches.

show abstract

Inducing a Lexicon of Abusive Words – a Feature-Based Approach

Wiegand¹,

Ruppenhofer²,

Schmidt³

et al. 2018

122

146

View full text Add to dashboard Cite

We address the detection of abusive words. The task is to identify such words among a set of negative polar expressions. We propose novel features employing information from both corpora and lexical resources. These features are calibrated on a small manually annotated base lexicon which we use to produce a large lexicon. We show that the word-level information we learn cannot be equally derived from a large dataset of annotated microposts. We demonstrate the effectiveness of our (domain-independent) lexicon in the crossdomain detection of abusive microposts.

show abstract

Untitled

Wiegand¹,

Ruppenhofer²,

Kleinbauer³

2019

115

View full text Add to dashboard Cite

We discuss the impact of data bias on abusive language detection. We show that classification scores on popular datasets reported in previous work are much lower under realistic settings in which this bias is reduced. Such biases are most notably observed on datasets that are created by focused sampling instead of random sampling. Datasets with a higher proportion of implicit abuse are more affected than datasets with a lower proportion.

show abstract

A survey of noise reduction methods for distant supervision

Roth

Barth

Wiegand

et al. 2013

View full text Add to dashboard Cite

We survey recent approaches to noise reduction in distant supervision learning for relation extraction. We group them according to the principles they are based on: at-least-one constraints, topic-based models, or pattern correlations. Besides describing them, we illustrate the fundamental differences and attempt to give an outlook to potentially fruitful further research. In addition, we identify related work in sentiment analysis which could profit from approaches to noise reduction.

show abstract

Separating Actor-View from Speaker-View Opinion Expressions using Linguistic Features

Wiegand

Schulder

Ruppenhofer

2016

View full text Add to dashboard Cite

We examine different features and classifiers for the categorization of opinion words into actor and speaker view. To our knowledge, this is the first comprehensive work to address sentiment views on the word level taking into consideration opinion verbs, nouns and adjectives. We consider many high-level features requiring only few labeled training data. A detailed feature analysis produces linguistic insights into the nature of sentiment views. We also examine how far global constraints between different opinion words help to increase classification performance. Finally, we show that our (prior) word-level annotation correlates with contextual sentiment views.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Michael Wiegand

A Survey on Hate Speech Detection using Natural Language Processing

Inducing a Lexicon of Abusive Words – a Feature-Based Approach

Untitled

A survey of noise reduction methods for distant supervision

Separating Actor-View from Speaker-View Opinion Expressions using Linguistic Features

Contact Info

Product

Resources

About