Abstract: Educational Data Mining (EDM) is used to extract and discover interesting patterns from educational institution datasets using Machine Learning (ML) algorithms. There is much academic information related to students available. Therefore, it is helpful to apply data mining to extract factors affecting students’ academic performance. In this paper, a web-based system for predicting academic performance and identifying students at risk of failure through academic and demographic factors is developed. The ML model…
“…Alboaneen et al [22] developed a web-based system to predict student performance using five regression-ML techniques. SVM, RF, LR, ANN, and KNN were applied to a dataset comprising of 10 features for 168 students.…”
Section: Predicting the Performance of Students
The science of data mining has contributed considerably to the education sector. However, most educational data mining (EDM) studies have focused on predicting the future performance of students and detecting at-risk students to provide early targeted interventions. A few studies used machine learning techniques to predict the future academic pathways for degree students. However, there are limited studies that use the datasets of high school students to predict future pathways. Moreover, such studies are yet to be conducted using datasets produced in Saudi high schools. Therefore, researchers that work in the education sector have focused on EDM and are eager to apply advanced computer science methods to upgrade old administrative systems. Furthermore, the education sector is a rich field with data that can be exploited to improve and enhance educational management systems and accelerate digital transformation. In this study, we explore the applications of EDM and review the algorithms used by other researchers in this field. We applied supervised machine learning classifiers to educational datasets collected from high schools in Saudi to predict the future academic pathways of students and identify the essential factors that affect them. This study contributes to the literature by developing a predictive model for students in Saudi high schools and detecting critical features that affect future academic careers of the students. Furthermore, we used explainable artificial intelligence to interpret the best model and enhance its transparency.
“…Researchers have shown a growing interest in the potential of neural networks, and studies in this area continue to expand our understanding of their capabilities and limitations [39,32]. A neural network is a type of algorithm used for supervised learning, which is commonly implemented for classification and prediction purposes [4]. Neurons constitute the fundamental components of Artificial Neural Networks (ANNs) that enable the mapping from different input layers to output layers.…”
Section: Artificial Neural Network for Classification
The features present in large datasets significantly affect the performance of machine learning models. Redundant and irrelevant features will be rejected and cause a decrease in machine learning model performance. This paper proposes HyFeS-ROS-ANN: Hybrid Feature Selection and Resampling combination method for binary classification using artificial neural network multilayer perceptron (MLP). The first stage of this approach is to use a combination of two feature selection methods to select essential features that are highly correlated with model performance. The second stage of this approach is to use a combination of resampling methods to handle unbalanced data classes. Both approaches are applied to the academic performance classification model using the MLP neural network. This research dataset is obtained using three-dimensional (3D) frameworks such as the Big Five Personality to determine the Personality that affects academic performance from the student dimension, the Family Influence Scale (FIS), which measures factors that affect academic performance from the family dimension, and Higher Education Institutions Service Quality (HEISQUAL) to measure service quality and its influence on academic performance from the Education institution dimension. Previous research shows that the CoR-ANN algorithm has a model accuracy rate of 94%. The research results based on the dataset show that our proposed method can improve accuracy by selecting more relevant and essential features in improving model performance. The results show that the features are reduced from 135 to 108, while the HyFS-ROS-ANN model for binary classification accuracy increases to 100%.
“…The cumulative grade point average (CGPA) is the most popular measure of academic success [11], followed by coursework marks [12]. Therefore, universities should constantly monitor and analyze students' academic performance to alert and support students performing poorly or experiencing declining performance [13].…”
Recent changes in the labor market and higher education sector have made graduates' employability a priority for researchers, governments, and employers in developed and emerging nations. There is, however, still a dearth of study about whether graduate students acquire the employability skills that businesses want of them because of their higher education. To determine a student's future employment and career path, it is critical to evaluate their soft skills. An emerging area called educational data mining (EDM) aims to gather enormous volumes of academic data produced and maintained by educational institutions and to derive explicit and specific information from it. This paper aims to predict students' soft skills such as professional, analytical, linguistic, communication, and ethical skills, based on their socio-economic, academic, and institutional data by leveraging data mining methods and machine learning techniques. All five soft skills were predicted using prediction models created using linear regression, probabilistic neural networks, and simple regression tree techniques. This study used a dataset from an open source that Universidad Technologica de Bolivar published. It covers academic, social, and economic data for 12,411 students. The experimental results demonstrated that the linear regression algorithm performed better than the others in predicting all five soft skills compared to machine learning methods. This finding can assist higher education institutions in making informed decisions, providing tailored support, enhancing student success and employability, and continuously modifying their programs to meet the needs of students.