Abstract: Predictive models of student success in Massive Open Online Courses (MOOCs) are a critical component of effective content personalization and adaptive interventions. In this article we review the state of the art in predictive models of student success in MOOCs and present a categorization of MOOC research according to the predictors (features), prediction (outcomes), and underlying theoretical model. We critically survey work across each category, providing data on the raw data source, feature engineering, st…
“…transfer with the baselines (Label-Truth, Label-Truth-AE, Naive Transfer, In-Situ Learning, and Instance-Based Transfer) for the similar pairs of source and target 6.00.1x → 6.00.1x (within offerings of one course) are shown in Table 4 and Figure 8a, and for the dissimilar pairs 6.00.2x → 6.00.1x (across two courses) in Table 5 and Figure 8b. Note that the performance of In-Situ Learning and … It shows that Naive Transfer overfits to the source domain from week 5.…”
Section: Transfer Learning Results (citation type: mentioning; confidence: 94%)
“…Feature identification is a critical precursor to prediction [8]. Examples of human-selected and engineered features are page views, video interactions, forum posts, and content interactions.…”
In a Massive Open Online Course (MOOC), predictive models of student behavior can support multiple aspects of learning, including instructor feedback and timely intervention. For an ongoing course, whose student outcomes are not yet known, such models must be trained on historical data from previously offered courses. Models can be transferred across courses, but their prediction performance is often poor; one reason is that the features inadequately represent the predictive attributes common to both courses. We present an automated transductive transfer learning approach that addresses this issue. It relies on a problem-agnostic, temporal organization of the MOOC clickstream data, in which, for each student in each course, a set of specific MOOC event types is expressed for each time unit. The approach comprises two alternative transfer methods based on representation learning with auto-encoders: a passive approach using transductive principal component analysis and an active approach that adds a correlation alignment loss term. With these methods, we investigate the transferability of dropout prediction across similar and dissimilar MOOCs and compare against known methods. Results show improved model transferability and suggest that the methods automatically learn a feature representation that expresses the predictive characteristics common across MOOCs.
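The correlation alignment term mentioned in this abstract can be made concrete. Below is a minimal sketch, assuming PyTorch; the function names and the auto-encoder interface (`ae` returning an encoding and a reconstruction) are illustrative assumptions rather than the authors' implementation. The loss penalizes the distance between the second-order statistics of source- and target-domain encodings, which is the standard CORAL formulation.

```python
import torch

def coral_loss(source: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Correlation alignment (CORAL) penalty: squared Frobenius distance
    between the feature covariance matrices of source and target batches."""
    d = source.size(1)
    # Center each batch, then form empirical covariance matrices.
    cs = source - source.mean(dim=0, keepdim=True)
    ct = target - target.mean(dim=0, keepdim=True)
    cov_s = cs.t() @ cs / (source.size(0) - 1)
    cov_t = ct.t() @ ct / (target.size(0) - 1)
    return ((cov_s - cov_t) ** 2).sum() / (4 * d * d)

def total_loss(ae, x_src, x_tgt, lam=1.0):
    # Hypothetical auto-encoder interface: returns (encoding, reconstruction).
    z_src, x_src_hat = ae(x_src)
    z_tgt, x_tgt_hat = ae(x_tgt)
    # Reconstruction loss on both domains keeps the representation informative;
    # the alignment term discourages source-specific feature statistics.
    recon = torch.nn.functional.mse_loss(x_src_hat, x_src) + \
            torch.nn.functional.mse_loss(x_tgt_hat, x_tgt)
    return recon + lam * coral_loss(z_src, z_tgt)
```

Minimizing the reconstruction terms preserves information from both courses, while the CORAL term actively pulls the learned feature distributions together, which is what "active" transfer means here.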
“…A key area of research has been methods for feature engineering, or extracting structured information from raw data (e.g., clickstream server logs, natural language in discussion posts) [8].…”
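To make this kind of feature engineering concrete, the sketch below derives per-student, per-week event-type counts from a clickstream log, in the spirit of the temporal organization described in the abstract above. It is a hypothetical example: the file name and column names (`user_id`, `timestamp`, `event_type`) are assumptions about the log schema, not a documented format.

```python
import pandas as pd

# Hypothetical log schema: one row per click event, with columns
# `user_id`, `timestamp`, and `event_type`
# (e.g., page_view, video_play, forum_post, problem_check).
logs = pd.read_json("clickstream.jsonl", lines=True)
logs["timestamp"] = pd.to_datetime(logs["timestamp"])

# Bucket events into course weeks, counted from the first observed event.
course_start = logs["timestamp"].min()
logs["week"] = (logs["timestamp"] - course_start).dt.days // 7 + 1

# Per-student, per-week count of each event type: the kind of structured
# feature matrix a downstream dropout predictor consumes.
features = logs.pivot_table(
    index=["user_id", "week"],
    columns="event_type",
    values="timestamp",
    aggfunc="count",
    fill_value=0,
)
```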
Section: A. Educational Big Data in the MOOC Era (citation type: mentioning)
Big data repositories from online learning platforms such as Massive Open Online Courses (MOOCs) represent an unprecedented opportunity to advance research on education at scale and impact a global population of learners. To date, such research has been hindered by poor reproducibility and a lack of replication, largely due to three types of barriers: experimental, inferential, and data. We present a novel system for large-scale computational research, the MOOC Replication Framework (MORF), to jointly address these barriers. We discuss MORF's architecture: an open-source platform-as-a-service (PaaS) that includes a simple, flexible software API supporting multiple modes of research (predictive modeling or production-rule analysis), integrated with a high-performance computing environment. All experiments conducted on MORF use executable Docker containers, which ensure complete reproducibility while allowing the use of any software or language that can be installed in a Linux-based Docker container. Each experimental artifact is assigned a DOI and made publicly available. MORF has the potential to accelerate and democratize research on its massive data repository, which currently includes over 200 MOOCs, as demonstrated by initial research conducted on the platform. We also highlight ways in which MORF represents a solution template for a more general class of problems faced by computational researchers in other domains.
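MORF's own API is not reproduced here, but the container-based execution pattern the abstract describes is easy to illustrate. The sketch below uses the Docker SDK for Python to show the general idea of running an experiment as an isolated, fully specified image; the image tag, command, and volume mount are hypothetical, not MORF's actual configuration.

```python
import docker

# Generic sketch of container-based reproducible execution: the experiment
# is packaged as a Docker image, so the exact software environment travels
# with the analysis and can be re-run byte-for-byte later.
client = docker.from_env()

output = client.containers.run(
    image="example/mooc-experiment:1.0",          # hypothetical image tag
    command=["python", "extract_train_test.py"],  # hypothetical entry point
    volumes={"/data/morf": {"bind": "/input", "mode": "ro"}},  # read-only data mount
    remove=True,  # clean up the container after it exits
)
print(output.decode())
```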
“…In the survey presented above and in other work [21], we have described the common practices of predictive modeling experiments in learning analytics. These include (a) a massive space of potential models, due to the many data sources, feature types, and algorithms used; (b) relatively small collections of datasets (for example, even the largest prior MOOC studies of which we are aware evaluate around 40 MOOCs [47, 17]); and (c) large individual datasets, which make repeated model-fitting undesirable, if not intractable.…”
Section: The Case for Bayesian Model Evaluation (citation type: mentioning; confidence: 99%)
“…We consider the following models in our experiment: (1) classical decision trees (CART) [7]; (2) L2- (or "ridge"-) regularized logistic regression (L2LR); (3) gradient-boosted trees (AdaBoost) [12], used as a stand-in for the widely used [21] random forest method; (4) a support vector machine (SVM) with a linear kernel; and (5) naïve Bayes (NB). These represent five of the most commonly used modeling algorithms in predictive models of student success in MOOCs [21]. A summary of the models considered, and any special preprocessing, is shown in Table 4.…”
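As a concrete illustration of this experimental setup, the sketch below fits the five algorithm families listed in the snippet using scikit-learn. It is an assumption-laden sketch rather than the paper's code: the feature matrix `X` and dropout labels `y` are presumed to be prepared elsewhere, and hyperparameters are left at defaults rather than tuned as in the study.

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import LinearSVC
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

models = {
    "CART": DecisionTreeClassifier(),
    "L2LR": LogisticRegression(penalty="l2", max_iter=1000),
    "AdaBoost": AdaBoostClassifier(),
    # Linear SVMs are sensitive to feature scale, so standardize first.
    "Linear SVM": make_pipeline(StandardScaler(), LinearSVC()),
    "Naive Bayes": GaussianNB(),
}

def compare(X, y, cv=10):
    """Cross-validated AUC for each model family (X, y prepared elsewhere)."""
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
        print(f"{name}: AUC = {scores.mean():.3f} +/- {scores.std():.3f}")
```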
Model evaluation, the process of making inferences about the performance of predictive models, is a critical component of predictive modeling research in learning analytics. We survey the state of the practice with respect to model evaluation in learning analytics, which overwhelmingly uses only naïve methods for model evaluation or statistical tests that are not appropriate for predictive model evaluation. We conduct a critical comparison of both null hypothesis significance testing (NHST) and a preferred Bayesian method for model evaluation. Finally, we apply three methods (the naïve average commonly used in learning analytics, NHST, and the Bayesian method) to a predictive modeling experiment on a large set of MOOC data. We compare 96 different predictive models, including different feature sets, statistical modeling algorithms, and tuning hyperparameters for each, using this case study to demonstrate the different experimental conclusions these evaluation techniques provide.
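One concrete form the Bayesian alternative can take is the Bayesian correlated t-test of Benavoli et al., implemented in the `baycomp` package. The sketch below is a minimal, hypothetical example assuming `baycomp`'s `two_on_single` interface and invented per-fold scores; it is not a reproduction of the paper's analysis.

```python
import numpy as np
import baycomp

# Per-fold AUC scores for two models from the same 10-fold CV split
# (illustrative numbers, not results from the paper).
scores_a = np.array([0.71, 0.74, 0.70, 0.73, 0.72,
                     0.75, 0.71, 0.74, 0.73, 0.72])
scores_b = np.array([0.69, 0.72, 0.70, 0.71, 0.70,
                     0.73, 0.69, 0.72, 0.71, 0.70])

# Bayesian correlated t-test: posterior probabilities that A is better,
# that the difference lies within a region of practical equivalence (rope),
# or that B is better.
p_a, p_rope, p_b = baycomp.two_on_single(scores_a, scores_b, rope=0.01)
print(f"P(A better)={p_a:.2f}, P(equivalent)={p_rope:.2f}, P(B better)={p_b:.2f}")
```

Unlike an NHST p-value, the output is directly interpretable: it gives the probability that one model is practically better, practically equivalent, or worse, which is the kind of conclusion the abstract contrasts with naïve averaging.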