2018
DOI: 10.1007/978-3-319-78583-7_3

Bridging the Gaps: Multi Task Learning for Domain Transfer of Hate Speech Detection

Abstract: Hate speech detection has been the subject of high research attention, due to the scale of content created on social media. In spite of the attention and the sensitive nature of the task, privacy preservation in hate speech detection has remained under-studied. The majority of research has focused on centralised machine learning infrastructures which risk leaking data. In this paper, we show that using federated machine learning can help address the privacy concerns that are inherent to hate speech detection w…

Cited by 68 publications (91 citation statements)

References 37 publications (13 reference statements)
“…Table 2 summarizes the obtained results for the fine-tuning strategies along with the official baselines. We use Waseem and Hovy [22], Davidson et al. [3], and Waseem et al. [23] as baselines and compare the results with our different fine-tuning strategies using the pre-trained BERT base model. The evaluation results are reported on the test dataset using three metrics: precision, recall, and weighted-average F1-score.…”
Section: Implementation and Results Analysis
confidence: 99%
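The weighted-average F1-score mentioned above averages each class's F1 weighted by that class's share of the true labels. A minimal sketch in plain Python (the function name and toy labels are illustrative, not from the cited paper):

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Weighted-average F1: per-class F1, weighted by true-class support."""
    labels = set(y_true) | set(y_pred)
    support = Counter(y_true)          # how many true examples per class
    total = len(y_true)
    score = 0.0
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == label)
        pred_pos = sum(1 for p in y_pred if p == label)
        true_pos = support[label]
        precision = tp / pred_pos if pred_pos else 0.0
        recall = tp / true_pos if true_pos else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        score += f1 * (true_pos / total)   # weight by class frequency
    return score

# Toy two-class example: "hate" vs. "none"
print(weighted_f1(["hate", "none", "none", "hate"],
                  ["hate", "none", "hate", "hate"]))
```

Weighting by support is what makes this metric robust to the heavy class imbalance typical of hate speech datasets, which is why the authors report it alongside precision and recall.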
“…Zhang et al. [25] used a CNN+GRU (Gated Recurrent Unit network) neural network model initialized with pre-trained word2vec embeddings to capture both word/character combinations (e.g., n-grams, phrases) and word/character dependencies (order information). Waseem et al. [23] brought new insight to hate speech and abusive language detection by proposing a multi-task learning framework to deal with datasets across different annotation schemes, labels, or geographic and cultural influences from data sampling. Founta et al. [7] built a unified classification model that can efficiently handle different types of abusive language such as cyberbullying, hate, and sarcasm.…”
Section: Previous Work
confidence: 99%
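The multi-task setup described above pairs a shared representation with one classifier head per dataset or annotation scheme. The toy sketch below shows only that structural idea; the character-count "encoder", class name, and task names are hypothetical stand-ins, not the paper's architecture:

```python
import random

random.seed(0)

def encode(text):
    # Toy shared "encoder": a bag-of-letters vector, standing in for the
    # shared learned layers of a real multi-task model.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

class MultiTaskModel:
    """One shared encoder, one linear head per task (dataset/annotation scheme)."""

    def __init__(self, tasks, dim=26):
        # Each task gets its own randomly initialized linear head.
        self.heads = {t: [random.uniform(-0.1, 0.1) for _ in range(dim)]
                      for t in tasks}

    def predict(self, task, text):
        feats = encode(text)        # shared representation across all tasks
        head = self.heads[task]     # task-specific decision layer
        score = sum(w * x for w, x in zip(head, feats))
        return 1 if score > 0 else 0

# Separate heads let differently-annotated corpora share the encoder.
model = MultiTaskModel(["waseem_hovy", "davidson"])
```

In training, gradients from every task's data would update the shared encoder while each head only sees its own dataset, which is how such a framework bridges differing annotation schemes.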
“…We expect that this result occurred for two reasons. First, the dataset contains a large number of cases where AAE is used (Waseem et al., 2018). Second, many of the AAE tweets also use words like "n*gga" and "b*tch", and are thus frequently associated with the hate speech and offensive classes, resulting in "false positive bias" (Dixon et al., 2018).…”
Section: Discussion
confidence: 99%