Toward Robust Arabic AI-Generated Text Detection: Tackling Diacritics Challenges
Hamed Alshammari,
Khaled Elleithy
Abstract:Current AI detection systems often struggle to distinguish between Arabic human-written text (HWT) and AI-generated text (AIGT) due to the small marks present above and below the Arabic text called diacritics. This study introduces robust Arabic text detection models using Transformer-based pre-trained models, specifically AraELECTRA, AraBERT, XLM-R, and mBERT. Our primary goal is to detect AIGTs in essays and overcome the challenges posed by the diacritics that usually appear in Arabic religious texts. We cre… Show more
Set email alert for when this publication receives citations?
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.