Abstract: Predictive models of student success in Massive Open Online Courses (MOOCs) are a critical component of effective content personalization and adaptive interventions. In this article we review the state of the art in predictive models of student success in MOOCs and present a categorization of MOOC research according to the predictors (features), prediction (outcomes), and underlying theoretical model. We critically survey work across each category, providing data on the raw data source, feature engineering, st…
“…transfer with the baselines (Label-Truth, Label-Truth-AE, Naive Transfer, In-Situ Learning, and Instance-Based Transfer) for the similar pairs of source and target 6.00.1x → 6.00.1x (within offerings of one course) are shown in Table 4 and Figure 8a, and for the dissimilar pairs 6.00.2x → 6.00.1x (across two courses) in Table 5 and Figure 8b. Note that the performance of In-Situ Learning and … It shows that Naive Transfer overfits to the source domain from week 5.…”
Section: Transfer Learning Results (citation type: mentioning; confidence: 94%)
“…Feature identification is a critical precursor to prediction [8]. Examples of human-selected and engineered features are page views, video interactions, forum posts, and content interactions.…”
In a Massive Open Online Course (MOOC), predictive models of student behavior can support multiple aspects of learning, including instructor feedback and timely intervention. For an ongoing course, whose student outcomes are not yet known, such models must be trained on historical data from previously offered courses. Models can be transferred across courses, but their prediction performance is often poor; one reason is that the features inadequately represent the predictive attributes common to both courses. We present an automated transductive transfer learning approach that addresses this issue. It relies on a problem-agnostic, temporal organization of the MOOC clickstream data, in which, for each student in each course, a set of specific MOOC event types is expressed for each time unit. The approach comprises two alternative transfer methods based on representation learning with auto-encoders: a passive approach using transductive principal component analysis and an active approach that adds a correlation alignment loss term. With these methods, we investigate the transferability of dropout prediction across similar and dissimilar MOOCs and compare against known methods. Results show improved model transferability and suggest that the methods automatically learn a feature representation that expresses the predictive characteristics common across MOOCs.
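The correlation alignment term mentioned in this abstract can be made concrete. Below is a minimal sketch, assuming PyTorch; the function names and the auto-encoder interface (`ae` returning an encoding and a reconstruction) are illustrative assumptions rather than the authors' implementation. The loss penalizes the distance between the second-order statistics of source- and target-domain encodings, which is the standard CORAL formulation.

```python
import torch

def coral_loss(source: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Correlation alignment (CORAL) penalty: squared Frobenius distance
    between the feature covariance matrices of source and target batches."""
    d = source.size(1)
    # Center each batch, then form empirical covariance matrices.
    cs = source - source.mean(dim=0, keepdim=True)
    ct = target - target.mean(dim=0, keepdim=True)
    cov_s = cs.t() @ cs / (source.size(0) - 1)
    cov_t = ct.t() @ ct / (target.size(0) - 1)
    return ((cov_s - cov_t) ** 2).sum() / (4 * d * d)

def total_loss(ae, x_src, x_tgt, lam=1.0):
    # Hypothetical auto-encoder interface: returns (encoding, reconstruction).
    z_src, x_src_hat = ae(x_src)
    z_tgt, x_tgt_hat = ae(x_tgt)
    # Reconstruction loss on both domains keeps the representation informative;
    # the alignment term discourages source-specific feature statistics.
    recon = torch.nn.functional.mse_loss(x_src_hat, x_src) + \
            torch.nn.functional.mse_loss(x_tgt_hat, x_tgt)
    return recon + lam * coral_loss(z_src, z_tgt)
```

Minimizing the reconstruction terms preserves information from both courses, while the CORAL term actively pulls the learned feature distributions together, which is what "active" transfer means here.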
“…A key area of research has been methods for feature engineering, or extracting structured information from raw data (e.g., clickstream server logs, natural language in discussion posts) [8].…”
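To make this kind of feature engineering concrete, the sketch below derives per-student, per-week event-type counts from a clickstream log, in the spirit of the temporal organization described in the abstract above. It is a hypothetical example: the file name and column names (`user_id`, `timestamp`, `event_type`) are assumptions about the log schema, not a documented format.

```python
import pandas as pd

# Hypothetical log schema: one row per click event, with columns
# `user_id`, `timestamp`, and `event_type`
# (e.g., page_view, video_play, forum_post, problem_check).
logs = pd.read_json("clickstream.jsonl", lines=True)
logs["timestamp"] = pd.to_datetime(logs["timestamp"])

# Bucket events into course weeks, counted from the first observed event.
course_start = logs["timestamp"].min()
logs["week"] = (logs["timestamp"] - course_start).dt.days // 7 + 1

# Per-student, per-week count of each event type: the kind of structured
# feature matrix a downstream dropout predictor consumes.
features = logs.pivot_table(
    index=["user_id", "week"],
    columns="event_type",
    values="timestamp",
    aggfunc="count",
    fill_value=0,
)
```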
Section: A. Educational Big Data in the MOOC Era (citation type: mentioning)
Big data repositories from online learning platforms such as Massive Open Online Courses (MOOCs) represent an unprecedented opportunity to advance research on education at scale and impact a global population of learners. To date, such research has been hindered by poor reproducibility and a lack of replication, largely due to three types of barriers: experimental, inferential, and data. We present a novel system for large-scale computational research, the MOOC Replication Framework (MORF), to jointly address these barriers. We discuss MORF's architecture: an open-source platform-as-a-service (PaaS) that includes a simple, flexible software API supporting multiple modes of research (predictive modeling or production-rule analysis), integrated with a high-performance computing environment. All experiments conducted on MORF use executable Docker containers, which ensure complete reproducibility while allowing the use of any software or language that can be installed in a Linux-based Docker container. Each experimental artifact is assigned a DOI and made publicly available. MORF has the potential to accelerate and democratize research on its massive data repository, which currently includes over 200 MOOCs, as demonstrated by initial research conducted on the platform. We also highlight ways in which MORF represents a solution template for a more general class of problems faced by computational researchers in other domains.
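MORF's own API is not reproduced here, but the container-based execution pattern the abstract describes is easy to illustrate. The sketch below uses the Docker SDK for Python to show the general idea of running an experiment as an isolated, fully specified image; the image tag, command, and volume mount are hypothetical, not MORF's actual configuration.

```python
import docker

# Generic sketch of container-based reproducible execution: the experiment
# is packaged as a Docker image, so the exact software environment travels
# with the analysis and can be re-run byte-for-byte later.
client = docker.from_env()

output = client.containers.run(
    image="example/mooc-experiment:1.0",          # hypothetical image tag
    command=["python", "extract_train_test.py"],  # hypothetical entry point
    volumes={"/data/morf": {"bind": "/input", "mode": "ro"}},  # read-only data mount
    remove=True,  # clean up the container after it exits
)
print(output.decode())
```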
“…In the survey presented above and in other work [21], we have described the common practices of predictive modeling experiments in learning analytics. These include (a) a massive space of potential models, due to the many data sources, feature types, and algorithms used; (b) relatively small collections of datasets (for example, even the largest prior MOOC studies of which we are aware evaluate around 40 MOOCs [47, 17]); and (c) large individual datasets, which make repeated model-fitting undesirable, if not intractable.…”
Section: The Case for Bayesian Model Evaluation (citation type: mentioning; confidence: 99%)
“…We consider the following models in our experiment: (1) classical decision trees (CART) [7]; (2) L2- (or "ridge"-) regularized logistic regression (L2LR); (3) gradient-boosted trees (AdaBoost) [12], used as a stand-in for the widely used [21] random forest method; (4) a support vector machine (SVM) with a linear kernel; and (5) naïve Bayes (NB). These represent five of the most commonly used modeling algorithms in predictive models of student success in MOOCs [21]. A summary of the models considered, and any special preprocessing, is shown in Table 4.…”
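As a concrete illustration of this experimental setup, the sketch below fits the five algorithm families listed in the snippet using scikit-learn. It is an assumption-laden sketch rather than the paper's code: the feature matrix `X` and dropout labels `y` are presumed to be prepared elsewhere, and hyperparameters are left at defaults rather than tuned as in the study.

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import AdaBoostClassifier
from sklearn.svm import LinearSVC
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

models = {
    "CART": DecisionTreeClassifier(),
    "L2LR": LogisticRegression(penalty="l2", max_iter=1000),
    "AdaBoost": AdaBoostClassifier(),
    # Linear SVMs are sensitive to feature scale, so standardize first.
    "Linear SVM": make_pipeline(StandardScaler(), LinearSVC()),
    "Naive Bayes": GaussianNB(),
}

def compare(X, y, cv=10):
    """Cross-validated AUC for each model family (X, y prepared elsewhere)."""
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
        print(f"{name}: AUC = {scores.mean():.3f} +/- {scores.std():.3f}")
```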
Model evaluation, the process of making inferences about the performance of predictive models, is a critical component of predictive modeling research in learning analytics. We survey the state of the practice with respect to model evaluation in learning analytics, which overwhelmingly uses only naïve methods for model evaluation or statistical tests that are not appropriate for predictive model evaluation. We conduct a critical comparison of both null hypothesis significance testing (NHST) and a preferred Bayesian method for model evaluation. Finally, we apply three methods (the naïve average commonly used in learning analytics, NHST, and the Bayesian method) to a predictive modeling experiment on a large set of MOOC data. We compare 96 different predictive models, including different feature sets, statistical modeling algorithms, and tuning hyperparameters for each, using this case study to demonstrate the different experimental conclusions these evaluation techniques provide.
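One concrete form the Bayesian alternative can take is the Bayesian correlated t-test of Benavoli et al., implemented in the `baycomp` package. The sketch below is a minimal, hypothetical example assuming `baycomp`'s `two_on_single` interface and invented per-fold scores; it is not a reproduction of the paper's analysis.

```python
import numpy as np
import baycomp

# Per-fold AUC scores for two models from the same 10-fold CV split
# (illustrative numbers, not results from the paper).
scores_a = np.array([0.71, 0.74, 0.70, 0.73, 0.72,
                     0.75, 0.71, 0.74, 0.73, 0.72])
scores_b = np.array([0.69, 0.72, 0.70, 0.71, 0.70,
                     0.73, 0.69, 0.72, 0.71, 0.70])

# Bayesian correlated t-test: posterior probabilities that A is better,
# that the difference lies within a region of practical equivalence (rope),
# or that B is better.
p_a, p_rope, p_b = baycomp.two_on_single(scores_a, scores_b, rope=0.01)
print(f"P(A better)={p_a:.2f}, P(equivalent)={p_rope:.2f}, P(B better)={p_b:.2f}")
```

Unlike an NHST p-value, the output is directly interpretable: it gives the probability that one model is practically better, practically equivalent, or worse, which is the kind of conclusion the abstract contrasts with naïve averaging.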