2023
DOI: 10.3390/app132111875

Automatic Vulgar Word Extraction Method with Application to Vulgar Remark Detection in Chittagonian Dialect of Bangla

Tanjim Mahmud, Michal Ptaszynski, Fumito Masui

Abstract: The proliferation of the internet, especially on social media platforms, has amplified the prevalence of cyberbullying and harassment. Addressing this issue involves harnessing natural language processing (NLP) and machine learning (ML) techniques for the automatic detection of harmful content. However, these methods encounter challenges when applied to low-resource languages like the Chittagonian dialect of Bangla. This study compares two approaches for identifying offensive language containing vulgar remarks…

Citations: cited by 24 publications (3 citation statements)
References: 61 publications
“…Cyberbullying detection has gained considerable attention in recent years owing to the widespread use of social media platforms and online communication channels [7]. Researchers have explored various techniques and methodologies to identify and address cyberbullying among various languages and cultures [7,8,31–35]. However, while numerous research efforts have introduced solutions to detect cyberbullying in high-resource languages such as English or Japanese, there is a limited number of studies that have extensively addressed cyberbullying detection in the low-resource languages, such as the Bangla language.…”
Section: Related Work (mentioning)
confidence: 99%
“…Chittagonian, spoken by approximately 13 million people, adds to Bangladesh's rich linguistic tapestry [5]. The widespread use of Unicode across communication devices has enabled individuals, including Chittagonian speakers, to freely express themselves in their native tongues [7,8]. Social media platforms like Facebook, imo, WhatsApp, or various blog services have been embraced by the people of Chittagong, thereby fostering an environment conducive to uninhibited self-expression [9–11].…”
Section: Introduction (mentioning)
confidence: 99%
“…There is no objective and scientific way of identifying when to use the words 'language' and 'dialect' (Melinger, 2018;Sherif et al, 2023). Distinguishing languages from dialects depends on at least three factors: mutual intelligibility, the speakers' culture or opinion, and political status (De la Torre & Gonong, 2020; Mahmud et al, 2023). The number of linguists who have exceeded those 200 or so essential vocabulary items has increased in recent years, thereby increasing the repertoire of forms that comparativists have to work with.…”
Section: Introductionmentioning
confidence: 99%