Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020
DOI: 10.18653/v1/2020.emnlp-main.470

Multilingual Offensive Language Identification with Cross-lingual Embeddings

Abstract: Offensive content is pervasive in social media and a reason for concern to companies and government organizations. Several studies have been recently published investigating methods to detect the various forms of such content (e.g. hate speech, cyberbullying, and cyberaggression). The clear majority of these studies deal with English, partially because most annotated datasets available contain English data. In this paper, we take advantage of English data available by applying cross-lingual contextual word embed…

citations
Cited by 96 publications
(77 citation statements)
references
References 45 publications
“…[31] employ a transfer learning approach using BERT for hate speech detection. [35] use cross-lingual embeddings to identify offensive content in multilingual setting. Our multilingual approach is similar in spirit to the method proposed in [34] which use the same model architecture and aligned word embedding to solve the tasks.…”
Section: Related Work (mentioning)
confidence: 99%
“…Pre-trained models on a language can be used for many tasks with further fine-tuning and training. On social media, many users communicate in mixed languages [166]. For example, in Asian countries like India and Pakistan, people mix English with the Urdu language.…”
Section: Handling Of a Dynamic Corpus (mentioning)
confidence: 99%
“…Joint training of universal encoders has led to enormous progress on standard benchmarks and industrial applications such as (Ranasinghe and Zampieri, 2020; Gencoglu, 2020).…”
Section: Continual Learning (mentioning)
confidence: 99%