2020
DOI: 10.1109/access.2020.3009217

Impact of Stemming and Word Embedding on Deep Learning-Based Arabic Text Categorization

Abstract: Document classification is a classical problem in information retrieval and plays an important role in a variety of applications. Automatic document classification can be defined as the content-based assignment of one or more predefined categories to documents. Many algorithms have been proposed and implemented to solve this problem in general; however, the classification of Arabic documents lags behind similar work in other languages. In this paper, we present seven deep learning-based algorithms to classify the Ar…

Cited by 73 publications (47 citation statements)
References 45 publications
“…Preprocessing is a key task in semantic text similarity process. Stemming is an important technique adopted for preprocessing texts due to the fact that it reduces feature space and improves performance of the similarity process ( Alhaj et al, 2019 ; Almuzaini & Azmi, 2020 ).…”
Section: Related Work (mentioning)
confidence: 99%
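The claim that stemming reduces the feature space can be illustrated with a minimal sketch. It assumes a toy English suffix stripper, `toy_stem`, which is a hypothetical stand-in and not the stemmer used in any of the cited Arabic studies; the four sample documents are likewise invented for illustration.

```python
def toy_stem(token: str) -> str:
    """Strip one common English suffix.

    Hypothetical stand-in for a real stemmer; it only shows how
    inflected forms collapse onto a shared feature.
    """
    for suffix in ("ing", "ed", "es", "s"):
        # Keep at least three characters so short words are not destroyed.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def vocabulary(docs, stem=False):
    """Return the set of distinct features (tokens or stems) in docs."""
    vocab = set()
    for doc in docs:
        for token in doc.lower().split():
            vocab.add(toy_stem(token) if stem else token)
    return vocab


# Invented sample documents (assumption, not data from the paper).
docs = [
    "classifying documents",
    "classified document",
    "classifies texts",
    "text classification",
]

raw = vocabulary(docs)
stemmed = vocabulary(docs, stem=True)
print(len(raw), len(stemmed))
```

On this sample the vocabulary shrinks because forms such as "documents" and "document" map to one feature. The sketch says nothing about whether classification accuracy improves, which is the empirical question the citing papers address.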
“…Stemming effect has been studied and applied to different domains of NLP and computation linguistics. This includes document categorization ( Alhaj et al, 2019 ; Almuzaini & Azmi, 2020 ), information retrieval ( Zeroual & Lakhouaja, 2017 ; Alnaied, Elbendak & Bulbul, 2020 ), automatic essay scoring ( Al-Shalabi, 2016 ), and sentiment analysis ( Al-Saqqa, Awajan & Ghoul, 2019 ). In all these studies it has been reported that stemming and lemmatization improves the performance of the resulted models.…”
Section: Related Work (mentioning)
confidence: 99%
“…Testing is carried out by processing the title attribute through both the standard stemming process and the modified stemming process. The two processes yield a comparison of recall values on the titles [10], [11].…”
Section: P-ISSN: 2621-8070 E-ISSN: 2686-3219 (unclassified)
“…The stemming rather reduces the information gained from the data in many languages. In fact, the stemming improves accuracy (ACC [28]) achieved by various methods in different languages including not only English [29], [30] but also Arabic [26], [27], [31], [32], Indonesian [23], [33], [34], Japanese [25], [35] French [36]- [38], Portuguese [37], [39], German [37], [40], [41], Hungarian [37], [42], [43], Spanish [44]- [47], and Turkish [48]- [50].…”
Section: Introduction (mentioning)
confidence: 99%