“…While many studies rely on established benchmark datasets such as Enron [20], C50 [7], PAN [22], IMDb62 [6,21], and others [9], the scarcity of standard datasets, particularly for low-resource languages, remains a challenge. The creation of specialized corpora, such as UNAAC [5], BAAD [2], UrduCorpus [5], A3C Corpus [8,25], and others [4], has paved the way for promising advances and expanded the resources available for AA.…”