The impact of indexing approaches on Arabic text classification

Al-Badarneh, Amer; Al-Shawakfa, Emad; Bani-Ismail, Basel; Al-Rababah, Khaleel; Shatnawi, Safwan

doi:10.1177/0165551515625030

Cited by 21 publications

(7 citation statements)

References 44 publications


“…The same results were obtained by Al-Badarneh et al [27] and Mustafa et al [43], which demonstrated that stem is a better choice to use in classifying Arabic text. Our obtained result is consistent with the finding of Liu and Zhang [45] which showed that the pre-processing steps like stemming improves SA accuracy.…”

Section: Results (supporting)

confidence: 82%


ASA: A framework for Arabic sentiment analysis

Oussous, Benjelloun, Lahcen, et al. 2019

Journal of Information Science


Sentiment analysis (SA), also known as opinion mining, is a growing important research area. Generally, it helps to automatically determine if a text expresses a positive, negative or neutral sentiment. It enables to mine the huge increasing resources of shared opinions such as social networks, review sites and blogs. In fact, SA is used by many fields and for various languages such as English and Arabic. However, since Arabic is a highly inflectional and derivational language, it raises many challenges. In fact, SA of Arabic text should handle such complex morphology. To better handle these challenges, we decided to provide the research community and Arabic users with a new efficient framework for Arabic Sentiment Analysis (ASA). Our primary goal is to improve the performance of ASA by exploiting deep learning while varying the preprocessing techniques. For that, we implement and evaluate two deep learning models namely convolutional neural network (CNN) and long short-term memory (LSTM) models. The framework offers various preprocessing techniques for ASA (including stemming, normalisation, tokenization and stop words). As a result of this work, we first provide a new rich and publicly available Arabic corpus called Moroccan Sentiment Analysis Corpus (MSAC). Second, the proposed framework demonstrates improvement in ASA. In fact, the experimental results prove that deep learning models have a better performance for ASA than classical approaches (support vector machines, naive Bayes classifiers and maximum entropy). They also show the key role of morphological features in Arabic Natural Language Processing (NLP).



“…Therefore, various research works may differ about the efficiency of the same stemmers. In addition, while some articles showed that Khoja stemmer is less efficient [20], other articles confirmed that Khoja has a good performance [26,27]. For this reason, we considered Khoja in our research.…”

Section: Related Work (mentioning)

confidence: 99%

ASA: A framework for Arabic sentiment analysis

Oussous, Benjelloun, Lahcen, et al. 2019

Journal of Information Science


“…Moreover, Hmeidi et al [12] studied the influence of raw text, khoja root-based stemmer and light stemming of Arabic text documents based on standard classifiers, such as NB, SVM, KNN, J48 and Decision Table classifiers. The results exhibited that the SVM and NB classifiers with light stemming provides better classification accuracy than other classifiers. The same conclusion was drawn up by Al-Badarneh [13] and Ayedh et al [14] by using various pre-processing methods. Additionally, Al-Molegi et al [15] and Khreisat [16] have proposed an approach to classify Arabic text documents based on the combination of N-grams with some similarity measures, including Manhattan, Euclidean distances and Dice.…”

Section: Related Work (supporting)

confidence: 76%

Untitled

2017

IJAIA


“…The authors have assessed their framework against [13] and [14]; the outcome of this comparison demonstrated the effectiveness of their proposed framework, it outperformed other approaches in the Normalized Discounted Cumulative Gain (NDCG) and precision evaluation measures. Further, the work of Al-Badarneh et al [15] investigated the impact of using different indexing techniques (full-word, stem, and root) when classifying Arabic text. It concludes that using 'fullword' or 'stem' outperforms 'root' when applied with the Naïve Bayes (NB) classifier.…”

Section: A Colloquial Arabic Text Classification (mentioning)

confidence: 99%