The effects of Pre-Processing Techniques on Arabic Text Classification

doi:10.30534/ijatcse/2021/061012021

Cited by 20 publications

(9 citation statements)

References 32 publications


“…These stemmers are ARLSTem v1.0 [16], Tashaphyne, integrated system of rice intensification (ISRI) stemmer [17], and the stemmer included in Madamira [18]. c) Lemmatization: Lemmatization has recently proved to be beneficial for Arabic text classifiers [19]- [21].…”

Section: Text Processing Techniques (mentioning)

confidence: 99%


Arabic authorship attribution on Twitter: what is really matters?

Kah,

airej

Zeroual

2022

IJEECS


Recently, authorship attribution (AA) of online social networks texts has gained more attention. However, since 2015, when the first work that addressed the AA of Arabic tweets was published, we found that nothing much has been done after that. Thus, the current paper presents an extensive study that investigates the effects of various factors on the AA of Arabic short-texts, especially tweets. This led to a proposed architecture in which the AA accuracy is examined depending on the size of the training dataset, the number of classes covered, the text processing techniques applied, the methods used for both feature selection and extraction, and finally, the classifier implemented. As a result, we performed 792 different tests. The highest accuracy recorded is 97.4%, and it is among the best results published so far.


“…Selecting the appropriate linguistic feature that can represent the original text is still intriguing researchers working on Arabic text classification [19], [20]. Therefore, we used 10 different linguistic features to represent the tweets' text.…”

Section: Effect Of Text Processing Techniques (mentioning)

confidence: 99%

Arabic authorship attribution on Twitter: what is really matters?

Kah,

airej

Zeroual

2022

IJEECS


“…Removing stop words reduces the storage spaces required to store identified tokens. Not to mention, many studies showed that removing Stop-words improve the efficiency and effectiveness of Arabic IR systems [5], [11].…”

Section: Stopping (mentioning)

confidence: 99%

“…Lemmatization takes a more complex approach in text processing; It aims to regroup semantically related words, and it is proved to be beneficial in the areas of Arabic information retrieval [11], [14]. However, in Arabic, the use of lemmatization is more difficult task due to the morphological complexity of the language itself, and the absence of short vowels in most existing Arabic documents [15].…”

Section: Lemmatization (mentioning)

confidence: 99%

Review on Recent Arabic Information Retrieval Techniques

AARAB

Oussous

Saddoune

2022

EAI Endorsed Trans IoT


Information retrieval is an important field that aims to provide a relevant document to a user information need, expressed through a query. Arabic is a challenging language that gained much attention recently in the information retrieval domain. To overcome the problems related to its complexity, many studies and techniques have been presented, most of them were conducted to solve the stemming problem. This paper presents an overview of the Arabic information retrieval process, including various text processing techniques, ranking approaches, evaluation measures, and some important information retrieval models. The paper finally presents some recent related studies and approaches in different Arabic information retrieval fields.


“…Documents in the bag of words containing no words were removed as well as their entry labels. A support vector machines (SVM)-based supervised classification model was used using the word frequency counts from the bag-of-words model and the labels [21], [22]. A multiclass linear classifier specifies the counts of the bag-of-words model to be the predictor, and the event type labels to be the response.…”

Section: Thematic Classification and Machine Learning (mentioning)

confidence: 99%

A Data Analytic Approach in the Thematic Classification of the Reasons and Perspectives of Adolescents’ Social Media Engagement

2021

IJATCSE


Social media is one of the leading platforms where people and organizations meet. With the technological advancements in the world wide web, smart mobile devices and Internet connectivity, an increase in social media engagement is highly observable. People of all ages, one way or the other, has a social media account and their perspective, reasons, and content-preferences vary. In this study, the experience of social media engagement from the perspective of young people is analyzed and thematically classified using a data analytic approach which focuses on natural language processing (NLP). Results show that the social media engagement experience of the respondents reflects what social media is to them and for them expressed by their reasons and perspectives, respectively. The reasons of their engagement are basically connected to the contents and features of social media platforms that suit their purpose, intention, and goals of engagement expressed by textual response and analyzed by a learning algorithm that fits multiclass models for support vector machines (SVM). The profound social media engagement of the youth leads to a wide spectrum of responses and behaviors that would affect their mental health as defined by their reasons and perspectives.
