This paper presents Convolutional Neural Network (CNN) models that classify Arabic sentences into three topics. The sentences are drawn from the Essex Arabic Summaries Corpus (EASC), tokenized into words, and transformed into sequences of word indices. All sequences are padded to the same length. The CNN models are built on top of a word embedding layer, which is either pre-trained or trained jointly with the model. Dropout and L2 weight regularization are used to reduce overfitting during training. The CNN models achieve high accuracy on Arabic sentence classification.
General Terms: Natural Language Processing, Deep Learning
Keywords: Classification, Convolutional neural network, Word
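The preprocessing named in the abstract above (tokenize, map words to indices, pad to a common length) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: whitespace tokenization, reserving index 0 for padding, and padding on the right are all assumptions, and the helper names `build_vocab` and `to_padded_sequences` are hypothetical.

```python
def build_vocab(sentences):
    """Map each distinct token to an integer index (0 is reserved for padding)."""
    vocab = {}
    for sentence in sentences:
        for token in sentence.split():  # assumption: whitespace tokenization
            if token not in vocab:
                vocab[token] = len(vocab) + 1
    return vocab


def to_padded_sequences(sentences, vocab, pad_value=0):
    """Convert sentences to index sequences and right-pad them to equal length."""
    sequences = [[vocab[token] for token in s.split()] for s in sentences]
    max_len = max(len(seq) for seq in sequences)
    return [seq + [pad_value] * (max_len - len(seq)) for seq in sequences]


# Two toy sentences (illustrative only, not taken from EASC).
sentences = ["ذهب الولد إلى المدرسة", "الولد يقرأ"]
vocab = build_vocab(sentences)
padded = to_padded_sequences(sentences, vocab)
print(padded)
```

Every row of `padded` has the same length, which is what an embedding layer followed by convolution expects as input.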
In this work, a new system for Arabic letter recognition is designed and implemented. New approaches to the segmentation, processing, classification, and hence recognition of characters and scripts are presented. The research concentrates on two subjects. First, segmentation is based on the word histogram and baseline estimation, and a suitable algorithm is developed for this purpose. Second, feature extraction, which finds the most useful points, is built on that algorithm. Features are coded as a string of eight digits through two counterclockwise passes. The code is then filtered using eight basic pairs. The filtered code is processed to form an array of 9×9 elements, together with an array of 2×2 elements that represents the four parts of the extracted character image. The resulting 85 elements (81 + 4) are the input to a backpropagation neural network used for classification. A recognition rate of 98.7% is achieved for Arabic character classification. The results show high recognition rates for Arabic letters across a variety of fonts and sizes, with negligible computing time and very small errors.
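The size of the network input in the abstract above follows from flattening the two arrays: 9 × 9 = 81 elements plus 2 × 2 = 4 elements gives 85. A minimal sketch of that assembly is shown below. The row-major ordering, the placement of the 2×2 block at the end, and the function name `build_input_vector` are assumptions, since the abstract does not state them.

```python
def build_input_vector(code_matrix, quadrant_matrix):
    """Flatten a 9x9 array and a 2x2 array into one 85-element list."""
    if len(code_matrix) != 9 or any(len(row) != 9 for row in code_matrix):
        raise ValueError("code_matrix must be 9x9")
    if len(quadrant_matrix) != 2 or any(len(row) != 2 for row in quadrant_matrix):
        raise ValueError("quadrant_matrix must be 2x2")
    flat_code = [value for row in code_matrix for value in row]           # 81 values
    flat_quadrants = [value for row in quadrant_matrix for value in row]  # 4 values
    return flat_code + flat_quadrants


# Placeholder values only; real entries would come from the feature coding step.
code_matrix = [[r * 9 + c for c in range(9)] for r in range(9)]
quadrant_matrix = [[81, 82], [83, 84]]
vector = build_input_vector(code_matrix, quadrant_matrix)
print(len(vector))
```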
In this paper, we present a new technique for abstractive summarization of Arabic texts. A system combining a knowledge base and fuzzy logic has been designed and implemented to simulate the human ability to understand the content of an Arabic text and to create an abstractive summary of it. The knowledge base has been designed for the financial and economic domain and consists of facts and if-then rules. The sentences are parsed in a preceding stage. Candidate summary sentences are obtained using the knowledge-based system, and a fuzzy system then selects the appropriate summary sentences. A general membership function has been designed from which all the mathematical shapes of membership functions can be obtained. The peak of the membership function is designed for the hierarchy relations of concepts and for the destination of semantic relations. The edges of the function are designed for the semantic relations of concepts and for the domain of semantic relations. The system has been tested on texts covering different subjects, taken from the Essex Arabic Summaries Corpus (EASC). The results show the effectiveness of the novel hybrid system in terms of semantics, meaning, and correct composition.
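To illustrate how a single parameterized membership function can produce several shapes, the sketch below uses the standard trapezoidal function, whose flat top plays the role of a peak and whose slopes play the role of edges. This is a textbook construction offered as an assumption; the abstract does not give the authors' actual function, and the name `trapezoid` and the parameters `a`, `b`, `c`, `d` are hypothetical.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    if not (a <= b <= c <= d):
        raise ValueError("parameters must satisfy a <= b <= c <= d")
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)  # rising edge
    return (d - x) / (d - c)      # falling edge


# Different parameter choices give different shapes:
triangular = trapezoid(0.5, 0, 1, 1, 2)   # b == c collapses the peak to a point
trapezoidal = trapezoid(3, 0, 1, 2, 4)    # flat peak between 1 and 2
rectangular = trapezoid(0, 0, 0, 2, 2)    # a == b and c == d give vertical edges
print(triangular, trapezoidal, rectangular)
```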
In this paper, a named entity recognition system is built using morphological, lexical, and semantic analysis. A rule-based system is designed for template mining from Arabic text. The Arabic texts are selected from the oil production domain and are taken from the Arabic BBC, RT, and CNN websites. The system is tested on these texts, and the results show high performance, fewer errors, and good accuracy in finding templates from the texts according to the extracted named entities.