Natural Language Processing (NLP) automates the processing of natural-language text and speech. Most of this work targets Western languages; Arabic has received less attention. This paper presents a model for recognizing Arabic sentences. A new morphological model based on regular expressions is developed to recognize Arabic verbs. A hash table containing all Arabic three-letter verb roots is implemented. The total number of Arabic verbs derived from three-letter roots is 23090, and the number of roots is 6104. A set of rules forming the Arabic grammar is used to derive and analyze the syntax of Arabic sentences. About 87% of the verbs represented in our regular-expression engine are detected. Moreover, the sentences are also recognized. In several surahs of the Quran, only 9% of the detected verbs are false positives (a non-verb declared as a verb), and 4% are false negatives (a verb classified as a noun). These error rates arise mainly because we do not use vowel marks, even though the Quran (our case study) uses them. We made this choice so that the model can handle all Arabic texts, most of which are written without vowel marks.
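The combination of regular expressions and a root hash table described above can be illustrated with a minimal sketch. It is not the paper's engine: the two patterns, the tiny `ROOTS` set, and the function name `is_candidate_verb` are assumptions made for illustration, and real Arabic morphology needs far more templates than these.

```python
import re

# Hypothetical toy root table; the paper's table holds 6104 three-letter roots.
ROOTS = {"كتب", "درس", "ذهب"}

# Arabic letters only (hamza through yeh); vowel marks are deliberately
# excluded, since the abstract works on unvowelled text.
LETTER = "[\u0621-\u064A]"

# Assumed, heavily simplified verb templates.
PATTERNS = [
    # Perfect: three root letters plus an optional subject suffix.
    re.compile(rf"^(?P<root>{LETTER}{{3}})(?:ت|وا)?$"),
    # Imperfect: one prefix letter, three root letters, optional plural suffix.
    re.compile(rf"^[يتنأ](?P<root>{LETTER}{{3}})(?:ون)?$"),
]


def is_candidate_verb(word):
    """Return True if a template matches and the extracted root is known."""
    for pattern in PATTERNS:
        match = pattern.match(word)
        # Set membership is the hash-table lookup step.
        if match and match.group("root") in ROOTS:
            return True
    return False
```

In this sketch the regular expression only proposes a root; the lookup in the root table decides whether the word is accepted. A noun such as بيت matches the first template's shape but is rejected because its three letters are not in the toy table.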
Data mining is an important step in knowledge discovery, which extracts new patterns from datasets. It is a widespread methodology that can help ministries, companies, and experts dive into data to find important insights and patterns that support sound decisions. Farmers and marketers of dates in the production regions have difficulty identifying the most important characteristics of date types from the economic, health, and consumer-type points of view, which they need in order to achieve the highest profits by choosing the best and most consumed types. The research objective is to extract interesting patterns from a date-product dataset using machine learning, based on association rule generation. This, in turn, will support farmers and marketers in discovering new features related to the production, consumption, and marketing processes. This research used a real dataset collected from the Qassim region of Saudi Arabia (KSA), which is the first region for palm cultivation and produces the best types of dates in the Arab region. The data were preprocessed and analyzed with the Apriori algorithm. The results show important features and insights related to the health benefits of dates, their production and consumption, consumer types, and marketing. Consequently, these results can be employed, for instance, to encourage individuals to consume dates for their nutritional value and important health benefits. Furthermore, the results encourage producers to focus on the production of preferred types and to improve the marketing policies for the other types.
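As an illustration of association rule generation with Apriori, the following is a minimal pure-Python sketch. The transactions, item names, and thresholds are hypothetical and are not taken from the study's dataset; the functions `apriori` and `association_rules` are illustrative names, not the authors' implementation.

```python
from itertools import combinations

# Hypothetical toy transactions; the real dataset's attributes are not given.
TRANSACTIONS = [
    {"sukkari", "high_price", "local_consumer"},
    {"sukkari", "high_price"},
    {"khalas", "low_price", "local_consumer"},
    {"sukkari", "high_price", "export"},
]


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset."""
    sets = [frozenset(t) for t in transactions]
    n = len(sets)
    candidates = {frozenset([item]) for t in sets for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count support and keep candidates that reach the threshold.
        level = {}
        for cand in candidates:
            support = sum(1 for t in sets if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        k += 1
        # Join step: merge frequent itemsets into size-k candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, confidence))
    return rules
```

With a minimum support of 0.5 on the toy data, the only frequent pair is `{sukkari, high_price}`, and both rules derived from it have confidence 1.0. Rules of this shape are the kind of pattern the abstract describes extracting.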
The number of Arab users of social media has increased significantly, creating more opportunities to extract knowledge from areas of life such as trade, education, and psychological health services. The active Arab presence on Twitter motivates many researchers to classify and analyze Arabic tweets from numerous aspects. This study aimed to explore the best-performing scenarios for classifying the emotions conveyed in Arabic tweets. Hence, various experiments were conducted to investigate the effects of feature extraction techniques and the N-gram model on the performance of three supervised machine learning algorithms: Support Vector Machine (SVM), Naïve Bayes (NB), and Logistic Regression (LR). The general method of the experiments was based on five steps: data collection, preprocessing, feature extraction, emotion classification, and evaluation of results. To implement these experiments, a real-world Twitter dataset was gathered. The best result, achieved by the SVM classifier when using a bag-of-words (BoW) weighting scheme (with unigrams and bigrams, or with unigrams, bigrams, and trigrams), exceeded the best performance results of the other algorithms.
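The bag-of-words features with N-grams mentioned above can be sketched in a few lines. This is only an assumed illustration of the feature extraction step, using whitespace tokenization and a hypothetical function name `ngram_bow`; it does not reproduce the study's preprocessing or classifiers.

```python
from collections import Counter


def ngram_bow(text, n_max=2):
    """Count all word n-grams of length 1..n_max in a whitespace-tokenized text."""
    tokens = text.split()
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Hypothetical example tweet ("I am very happy").
features = ngram_bow("أنا سعيد جدا", n_max=2)
```

Setting `n_max=2` corresponds to the unigram-plus-bigram setting, and `n_max=3` adds trigrams. In a full pipeline, count vectors of this kind would be passed to a classifier such as an SVM.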