Higher Education plays a principal role in today's changing and complex world, and, thanks to advances in Data Mining techniques, there has been rapid growth in the scientific literature dedicated to predicting students' academic success or risk of dropout. Degrees such as Computer Science in particular are in ever greater demand. Although the number of students has increased, the number graduating is still not enough to meet society's needs. This study contributes to reversing this situation by introducing an approach that not only predicts dropout risk or student performance but also takes action to help both students and educational institutions. The focus is on maximizing graduation rates by constructing a Recommender System to assist students with their selection of subjects. In particular, the study addresses the challenge of constructing reliable Recommender Systems from data that are scarce, sparse, imbalanced, and anonymized, and that might have been stored under imperfect conditions. The approach is applied to create a Recommender System using a real-world dataset from a public Spanish university containing performance data from a Computer Science degree course, demonstrating its applicability in real environments. The construction of a support system based on that approach is described, its results are evaluated, and its implications for students' academic achievement and for institutions' graduation rates are discussed. Through the construction of this decision support system for students, we intend to increase graduation rates and lower the dropout rate.
High levels of school dropout are a major burden on the educational and professional development of a country's inhabitants. A country's prosperity depends, among other factors, on its ability to produce higher education graduates capable of moving it forward. To alleviate the dropout problem, more and more institutions are turning to artificial intelligence to predict dropout as early as possible. The difficulty of accessing personal data, and the privacy issues that such access entails, force institutions to rely on their students' academic data to create accurate and reliable predictive systems. This work focuses on creating the best possible predictive model based solely on academic data; accordingly, its capacity to infer knowledge must be maximised. Thus, Feature Engineering and Instance Engineering techniques are applied in detail: dealing with redundancy, feature significance, correlation, feature cardinality, and missing values; creating or eliminating features; data fusion; removing uninformative instances; binning; resampling; normalisation; and encoding. Well-known models such as Gradient Boosting, Random Forest, and Support Vector Machine, along with an Ensemble of them, are then constructed at different stages: prior to enrolment and at the end of the first, second, third, and fourth semesters. These predictive models serve as inputs to a decision support system, enabling the application of effective dropout prevention policies.
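A few of the preprocessing steps named in the abstract above (normalisation, binning, encoding) and the idea of combining several models can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption for illustration: the function names, the bin edges, the category names, and the choice of simple majority voting are hypothetical, since the abstract does not state how the features were built or how the Ensemble combines its members.

```python
from collections import Counter


def min_max_normalise(values):
    """Rescale numeric values to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def bin_value(value, edges):
    """Return the index of the first bin whose upper edge exceeds value.

    `edges` must be sorted ascending; values at or above the last edge
    fall into the final bin, whose index is len(edges).
    """
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)


def one_hot(category, categories):
    """Encode a categorical value as a 0/1 vector over known categories."""
    return [1 if category == c else 0 for c in categories]


def majority_vote(predictions):
    """Combine per-model labels by simple majority.

    Assumption: an odd number of models and binary labels, so no ties occur.
    The source does not specify the ensembling rule.
    """
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical example values, not taken from the study's dataset.
grades = [5.0, 7.5, 10.0]
scaled = min_max_normalise(grades)
grade_bin = bin_value(7.0, [5.0, 7.0, 9.0])
mode = one_hot("part_time", ["full_time", "part_time"])
label = majority_vote(["dropout", "graduate", "dropout"])
```

In practice the three votes would come from trained Gradient Boosting, Random Forest, and Support Vector Machine models, with one such set fitted per stage (prior to enrolment and after each of the first four semesters).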