This paper proposes an efficient, Chi-Square-based, feature selection method for Arabic text classification. In Data Mining, feature selection is a preprocessing step that can improve the classification performance. Although few works have studied the effect of feature selection methods on Arabic text classification, limited number of methods was compared. Furthermore, different datasets were used by different works. This paper improves the previous works in three aspects. First, it proposes a new efficient feature selection method for enhancing Arabic text classification. Second, it compares extended number of existing feature selection methods. Third, it adopts two publicly available datasets to encourage future works to adopt them in order to guarantee fair comparisons among the various works. Our experiments show that our proposed method outperformed the existing methods in term of accuracy.
Big Data courses in which students are asked to carry out Big Data projects are becoming more frequent as a part of University Engineering curriculum. In these courses, instructors and students must face a series of special characteristics, difficulties and challenges that it is important to know about beforehand, so the lecturer can better plan the subject and manage the teaching methods in order to prevent students' academic dropout and low performance. The goal of this research is to approach this problem by sharing the lessons learned in the process of teaching e-learning courses where students are required to develop a Big Data project as a part of a final degree/master course. In order to do so, a survey was carried out among a group of students enrolled in those kinds of courses during the last years. The quantitative and qualitative analysis of the obtained data led us to present a series of lessons learned that may help other participants (both students and lecturers) to better study, design and teach similar courses. In addition, the results shed light on possible existing open problems in the area of Big Data project development. Both the methodology used and the survey designed in this research were validated by a group of experts in the area using a formal statistical approach at a significance level of p<0.008, which support the validity of the lessons learned.
Example-based transformational approaches to automate adaptive maintenance changes plays an important role in software research. One primary concern of those approaches is that a set of good qualified real examples of adaptive changes previously made in the history must be identified, or otherwise the adoption of such approaches will be put in question. Unfortunately, there is rarely enough detail to clearly direct transformation rule developers to overcome the barrier of finding qualified examples for adaptive changes. This work explores the histories of several open source systems to study the repetitiveness of adaptive changes in software evolution, and hence recognizing the source code change patterns that are strongly related with the adaptive maintenance. We collected the adaptive commits from the history of numerous open source systems, then we obtained the repetitiveness frequencies of source code changes based on the analysis of Abstract Syntax Tree (AST) edit actions within an adaptive commit. Using the prevalence of the most common adaptive changes, we suggested a set of change patterns that seem correlated with adaptive maintenance. It is observed that 76.93% of the undertaken adaptive changes were represented by 12 AST code differences. Moreover, only 9 change patterns covered 64.69% to 76.58% of the total adaptive change hunks in the examined projects. The most common individual patterns are related to initializing objects and method calls changes. A correlation analysis on examined projects shows that they have very similar frequencies of the patterns correlated with adaptive changes. The observed repeated adaptive changes could be useful examples for the construction of transformation approaches
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.