2019 18th IEEE International Conference on Machine Learning and Applications (ICMLA) 2019
DOI: 10.1109/icmla.2019.00104

Hateful Speech Detection in Public Facebook Pages for the Bengali Language

Cited by 61 publications
(24 citation statements)
References 10 publications
“…Nevertheless, with the advances in multilingual parsers and deep learning technology, together with increasing pressures from policy-makers to handle hate speech issues at local resources, non-English HS detection toolkits have seen a steady increase. The figure indicates that about 51% of all works in this field are performed on English dataset, with an increase of proportion of other languages as well where Arabic (13%) [93,59,12,143], Turkish (6%) [143,104], Greek (4%) [143,6,136], Danish (5%) [106,143], Hindi (4%) [121,22,88], German (4%) [72,120], Malayalam (3%) [130,109], Tamil (3%) [130,20], Chinese (1%) [138,139,155], Italian (2%) [116], Urdu (1%) [126,95,7], Russian (1%) [17], Bengali (1%) [62,127,69], Korean (1%) [91], French (1%) [16,102,50], Indonesian (1%) [14], Portuguese (1%) [14], Spanish (1%) [56] and Polish (1%) [118] seem to dominate the rest of the languages in this field.…”
Section: Statistical Trends Of Results
Citation type: mentioning
confidence: 99%
“…The main challenge is the lack of sufficient data. To the best of our knowledge, many of the datasets were around 5000 corpora [5], [8] and [12]. There was a publicly available corpus containing around 10000 corpora, which were annotated into five different classes [2].…”
Section: Literature Review
Citation type: mentioning
confidence: 99%
“…In Bengali, several works investigated the presence of abusive language in social media data by leveraging supervised ML classifiers and labeled data (Ishmam & Sharmin, 2019; Banik & Rahman, 2019). Sazzed (2021) annotated 3,000 transliterated Bengali comments into two classes, abusive and non-abusive, 1,500 comments for each.…”
Section: Related Work
Citation type: mentioning
confidence: 99%