Mining Educational Data to Predict Academic Dropouts: a Case Study in Blended Learning Course

Sukhbaatar, Otgontsetseg; Ogata, Kohichi; Usagawa, Tsuyoshi

doi:10.1109/tencon.2018.8650138

Cited by 19 publications

(6 citation statements)

References 6 publications

“…Results show that the performance of predictive models strongly varies across courses, even when they are generated with data collected from a single institution. In Sukhbaatar et al [38], the authors used a decision tree analysis on LMS data with the goal of predict (until the middle of the semester) students that are at-risk of failing or dropout in a blended course. Results showed that this approach worked well to predict the dropouts.…”

Section: Blended Courses (mentioning)

confidence: 99%

Predicting Students Success in Blended Learning—Evaluating Different Interactions Inside Learning Management Systems

et al. 2019

Algorithms and programming are some of the most challenging topics faced by students during undergraduate programs. Dropout and failure rates in courses involving such topics are usually high, which has drawn attention to the development of strategies to attenuate this situation. Machine learning techniques can help in this direction by providing models able to detect at-risk students earlier, so that lecturers, tutors, or staff can intervene pedagogically to mitigate the problem. To predict at-risk students early in introductory programming courses, we present a comparative study aiming to find the best combination of datasets (sets of variables) and classification algorithms. The data collected from Moodle was used to generate 13 distinct datasets based on different aspects of student interactions (cognitive presence, social presence and teaching presence) inside the virtual environment. Results show that there is no statistically significant difference among models generated from the different datasets and that the counts of interactions together with derived attributes are sufficient for the task. The performance of the models varied for each semester, with the best of them able to detect at-risk students in the first week of the course with AUC ROC from 0.7 to 0.9. Moreover, the use of SMOTE to balance the datasets did not improve the performance of the models.
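The abstract reports model quality as AUC ROC. As a minimal sketch of what that number measures, the block below computes AUC as the probability that a randomly chosen at-risk student receives a higher risk score than a randomly chosen student who is not at risk. The function name `roc_auc` and the labels and scores are hypothetical illustrations, not data or code from the cited study.

```python
def roc_auc(labels, scores):
    """AUC ROC via pairwise comparison (the Mann-Whitney formulation).

    labels: 1 for at-risk (positive), 0 otherwise.
    scores: predicted risk scores, higher means more likely at-risk.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical example: four students, two of them at-risk.
example_labels = [1, 1, 0, 0]
example_scores = [0.9, 0.4, 0.5, 0.1]
example_auc = roc_auc(example_labels, example_scores)  # 3 of 4 pairs ordered correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts the reported 0.7 to 0.9 range in context.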

“…In addition, DTs are easy to interpret in terms of their prediction rules (Louppe, 2014). These characteristics make this model a popular and widely used learning algorithm for predicting school dropout (Pereira & Zambrano, 2017; Sukhbaatar et al, 2018). RF is a model based on decision trees that handles high-dimensional datasets well (Hastie et al, 2009).…”

Section: Modeling (unclassified)

Modelo de Predição de Evasão Escolar com Base em Dados de Autoavaliação de Cursos de Graduação

Oliveira, Medeiros (2024), RBIE

School dropout is a daily challenge for educational institutions; in higher education specifically, high dropout rates lead to financial losses and a shortage of professionals in the job market. This research aimed to develop and evaluate a predictive model to identify students prone to dropping out, using data from a semester-based self-assessment model for undergraduate programs at the Universidade Federal da Paraíba (UFPB). Using educational data mining and the CRISP-EDM methodology, the study analyzed the relationship between school dropout and institutional self-assessment, followed by exploratory analysis and data preparation for classification. Several modeling techniques, such as Decision Tree, Random Forest, and Support Vector Machines, were applied, and the models were evaluated with performance metrics, revealing an accuracy of 87.97%, precision of 91.72%, recall of 91.67%, and F-measure of 91.57% in identifying students with a high probability of dropping out. About 59% of active UFPB students admitted from 2017 onward showed a probability of abandoning their programs in tests of the proposed predictive model. This information can support institutional decisions and the implementation of effective policies and actions against dropout, aiming to improve academic outcomes. The study contributes to advances in dropout prediction, providing valuable insights for decisions and preventive strategies at UFPB and other higher education institutions.
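The abstract evaluates its models by accuracy, precision, recall, and F-measure. As a minimal sketch of how these four metrics follow from a binary confusion matrix, the block below computes them from raw counts. The function name `classification_metrics` and the counts in the example are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp: dropouts correctly flagged      fp: non-dropouts wrongly flagged
    fn: dropouts missed                 tn: non-dropouts correctly passed over
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure (F1) is the harmonic mean of precision and recall,
    # written here in its equivalent count form.
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts for illustration only.
example_metrics = classification_metrics(tp=8, fp=2, fn=2, tn=8)
```

Because the F-measure is a harmonic mean, it always lies between precision and recall, so the three values should be read together when comparing models.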

“…This underscores the pressing need for comprehensive methodologies that seamlessly integrate feature selection with predictive modeling. Table 2 provides a succinct summary of diverse feature selection methods employed in EDM research, including manual selection [25]-[30], filter-based techniques such as correlation and information gain [31]-[37], and wrapper methods such as genetic algorithms and Principal Component Analysis (PCA) [38]-[43]. These diverse approaches collectively contribute to the evolving landscape of feature selection in EDM, paving the way for more robust predictive models.…”

Section: Literature Review (mentioning)

confidence: 99%

An Adaptive Feature Selection Algorithm for Student Performance Prediction

Roy, Farid (2024), IEEE Access

Educational Data Mining (EDM) is used to improve the teaching and learning process by analyzing and classifying data that can be applied to predict students' academic performance and dropout rate, as well as instructors' performance. The prediction of student performance is complicated by the vast and diverse range of variables, from academic records to behavioral and health metrics. In this paper, we have introduced a new Adaptive Feature Selection Algorithm (AFSA) by amalgamating an ensemble approach for initial feature ranking with normalized mean ranking from five distinct methods to enhance robustness. The proposed method iteratively selects the best features by adjusting its threshold based on each feature's rank to ensure significant contributions to model accuracy, and it also effectively reduces dataset complexity. We have tested the performance of the proposed feature selection algorithm using five machine learning classifiers: Logistic Regression (LR), K-Nearest Neighbour (KNN), Support Vector Machine (SVM), Naïve Bayes (NB) classifier, and Decision Tree (DT) classifier on four student performance datasets. The experimental results show that the proposed method decreases the feature count by an average feature reduction factor of 5.7, streamlining datasets while maintaining competitive cross-validation accuracy, marking it as a valuable tool in the field of educational data analytics.
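The abstract mentions combining feature rankings from several methods through a normalized mean rank. The block below is a minimal sketch of that general idea only, not the AFSA algorithm itself, whose details the abstract does not give. The function name `mean_normalized_rank`, the method names, the feature names, and the scores are all hypothetical, and the sketch assumes that a higher score means a more important feature.

```python
def mean_normalized_rank(scores_by_method):
    """Average each feature's normalized rank across several ranking methods.

    scores_by_method: {method_name: {feature_name: importance_score}}
    Returns {feature_name: mean rank in [0, 1]}, where 0.0 means the feature
    was ranked first by every method and 1.0 means it was ranked last by all.
    Ties within a method are broken alphabetically by feature name.
    """
    if not scores_by_method:
        raise ValueError("at least one ranking method is required")

    features = sorted(next(iter(scores_by_method.values())))
    n = len(features)
    totals = {f: 0.0 for f in features}

    for scores in scores_by_method.values():
        # Highest score first; sorted() is stable, so ties keep alphabetical order.
        ordered = sorted(features, key=lambda f: scores[f], reverse=True)
        for position, feature in enumerate(ordered):
            totals[feature] += position / (n - 1) if n > 1 else 0.0

    method_count = len(scores_by_method)
    return {f: totals[f] / method_count for f in features}


# Hypothetical scores from two ranking methods over three features.
example_ranks = mean_normalized_rank({
    "method_a": {"attendance": 0.9, "quiz_avg": 0.5, "forum_posts": 0.1},
    "method_b": {"attendance": 0.2, "quiz_avg": 0.8, "forum_posts": 0.1},
})
```

Averaging ranks rather than raw scores sidesteps the fact that different ranking methods produce scores on incomparable scales. A threshold on the mean rank could then decide which features to keep.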
