Natural language processing has witnessed remarkable progress with the advent of deep learning techniques. Text summarization, along with other tasks such as translation and sentiment analysis, now uses deep neural network models to improve results. Recent text summarization methods are based on a sequence-to-sequence framework built on the encoder–decoder model, which is composed of neural networks trained jointly on both input and output. Deep neural networks take advantage of large datasets to improve their results. These networks are supported by the attention mechanism, which handles long texts more efficiently by identifying focus points in the text. They are also supported by the copy mechanism, which allows the model to copy words directly from the source into the summary. In this research, we re-implement the basic summarization model that applies the sequence-to-sequence framework to the Arabic language, for which this model has not previously been used in text summarization. First, we build an Arabic dataset of summarized article headlines. This dataset consists of approximately 300 thousand entries, each consisting of an article introduction and the headline corresponding to that introduction. We then apply baseline summarization models to this dataset and compare the results using the ROUGE metric.
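To make the evaluation step concrete, the following is a minimal sketch of ROUGE-N computed from n-gram overlap between a generated headline and a reference headline. It is an illustration only, not the authors' evaluation code: the function name `rouge_n` and the whitespace tokenization are assumptions, and a real Arabic setup would likely need proper tokenization and normalization.

```python
from collections import Counter


def rouge_n(candidate: str, reference: str, n: int = 1) -> dict:
    """Illustrative ROUGE-N: clipped n-gram overlap between two texts.

    Assumption: tokens are separated by whitespace. This is a sketch,
    not the scoring script used in the paper.
    """

    def ngrams(text: str) -> Counter:
        tokens = text.split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge_n("the cat sat", "the cat ran", n=1)
print(scores)
```

Recall measures how much of the reference headline is covered by the generated one, precision measures how much of the generated headline appears in the reference, and the F1 value combines the two.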
An amendment to this paper has been published and can be accessed via the original article.
Dictionaries are essential resources used by almost all Natural Language Processing (NLP) applications. Since language is constantly evolving, new words and new meanings for existing words continuously appear. To keep a dictionary up to date, an enrichment process is needed to incorporate new vocabulary. In the last decade, a new approach to resource construction has emerged, based on collaboration between different users on the Web. In this paper, we present the Interactive Arabic Dictionary (IAD): a monolingual web-based dictionary. Initially based on the "Almuajam Alwasseet" dictionary, IAD provides the different meanings of Arabic words, with specific morphological and syntactic information, in addition to other related information such as example sentences, multimedia illustrations, associated words, semantic domains, expressions, linguistic avails, and common mistakes. Authorized users can collaboratively enrich the content of the dictionary through a "controlled process" to add or modify entries, meanings, or any kind of detailed information related to them. This "controlled process" consists of a suggestion-validation procedure that maintains the integrity of the dictionary. This enrichment process will expand the dictionary content, allowing its future exploitation in high-level NLP applications.
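The suggestion-validation procedure described above can be sketched as a small review queue. The abstract does not specify how IAD implements this, so every name here (`Suggestion`, `Dictionary`, `suggest`, `validate`) is hypothetical, and the single approve/reject step is an assumption about what "validation" involves.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    REJECTED = "rejected"


@dataclass(eq=False)  # identity-based equality, so queue removal is exact
class Suggestion:
    word: str
    meaning: str
    author: str
    status: Status = Status.PENDING


@dataclass
class Dictionary:
    """Hypothetical model: published entries plus a queue of pending suggestions."""
    entries: dict = field(default_factory=dict)
    queue: list = field(default_factory=list)

    def suggest(self, word: str, meaning: str, author: str) -> Suggestion:
        # A suggestion never changes published entries directly.
        suggestion = Suggestion(word, meaning, author)
        self.queue.append(suggestion)
        return suggestion

    def validate(self, suggestion: Suggestion, approve: bool) -> None:
        # Assumption: a validator either approves or rejects each suggestion once.
        if suggestion.status is not Status.PENDING:
            raise ValueError("suggestion already reviewed")
        if approve:
            self.entries.setdefault(suggestion.word, []).append(suggestion.meaning)
            suggestion.status = Status.ACCEPTED
        else:
            suggestion.status = Status.REJECTED
        self.queue.remove(suggestion)


iad = Dictionary()
pending = iad.suggest("kitab", "a written or printed work", author="user1")
iad.validate(pending, approve=True)
print(iad.entries)
```

Keeping suggestions separate from published entries until a validator acts is what preserves the integrity of the dictionary in this sketch: unreviewed content is never visible as an entry.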