Abstract: Digital learning environments generate a precise record of the actions learners take as they interact with learning materials and complete exercises towards comprehension. With this high quantity of sequential data comes the potential to apply time series models to learn about underlying behavioral patterns and trends that characterize successful learning based on the granular record of student actions. There exist several methods for looking at longitudinal, sequential data like those recorded from learning e…
“…Within a set time period of one week, various representative values were calculated to summarize a learner's behavior. We constructed twelve features, most of which were taken from a thoroughly described set of features used in a similar experiment by [13], but with the week-by-week comparison features removed; while [13] predicted when a learner would drop out, we focus on whether the learner eventually receives certification. To account for the loss of information about how far a learner has progressed through the course, we included two extra features not present in [13] (see features 6 and 12 in Table 2).…”
Section: Methods
mentioning confidence: 99%
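As a rough illustration of the weekly summarization described in the snippet above, the sketch below aggregates a toy clickstream into per-learner, per-week counts. The event schema and the three features shown (event count, active days, video plays) are assumptions for illustration only, not the twelve features of Table 2 in the cited work.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def weekly_features(events, course_start):
    """Aggregate raw clickstream events into per-learner, per-week summaries.

    `events` is an iterable of (learner_id, timestamp, event_type) tuples.
    The schema and the feature names below are hypothetical, chosen only
    to illustrate the week-level summarization step.
    """
    feats = defaultdict(lambda: {"n_events": 0, "active_days": set(),
                                 "n_video_plays": 0})
    for learner, ts, event_type in events:
        week = (ts - course_start).days // 7  # 0-indexed course week
        f = feats[(learner, week)]
        f["n_events"] += 1
        f["active_days"].add(ts.date())
        if event_type == "play_video":
            f["n_video_plays"] += 1
    # Flatten the day sets into counts so each cell is a scalar feature.
    return {key: {"n_events": f["n_events"],
                  "n_active_days": len(f["active_days"]),
                  "n_video_plays": f["n_video_plays"]}
            for key, f in feats.items()}
```

Each (learner, week) key then maps to a fixed-length feature vector suitable for a sequence model over course weeks.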
“…Very few studies have combined predictive modeling with real-world interventions in a MOOC. In [20], next-resource suggestions were made using a predictive model of behavior [19]. On residential campuses, predictive models of drop-out have been operationalized by dispatching counselors to flagged students [18], an approach that can have the unintended side effect of signaling to students that they are unlikely to pass the course, thus catalyzing a higher rate of drop-out than would occur without the intervention.…”
“…The novel intuition of our application of this to student course sequences is that instead of learning the structure of language by training on sequences of words, we are learning the structure of learner behaviour from sequences of page views. It was previously found that clickstream behaviours within MOOCs could be predicted using a Recurrent Neural Network (RNN) with 70% accuracy, compared to the 45% accuracy provided by the expected path through the course when following the existing course structure (Tang, Peterson, & Pardos, 2017). This work builds on the observation that patterns exist in learner clickstream behaviours.…”
Section: Representation Learning With Skip-grams
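The RNN used by Tang, Peterson, and Pardos (2017) is beyond a short snippet, but the underlying idea, predicting a learner's next page view from observed clickstream transitions, can be illustrated with a first-order (bigram) baseline. This is a deliberately simple stand-in rather than the cited model; it is the kind of count-based baseline an RNN's accuracy would be compared against.

```python
from collections import Counter, defaultdict

def train_bigram(sequences):
    """Count page -> next-page transitions across learner sessions."""
    trans = defaultdict(Counter)
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            trans[cur][nxt] += 1
    return trans

def predict_next(trans, page):
    """Predict the most frequently observed next page, or None if unseen."""
    if page not in trans or not trans[page]:
        return None
    return trans[page].most_common(1)[0][0]
```

An RNN improves on this baseline by conditioning on the whole preceding sequence rather than only the current page, which is what lets it beat the course's designed path as a predictor.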
We introduce a novel approach to visualizing temporal clickstream behaviour in the context of a degree-satisfying online course, Habitable Worlds, offered through Arizona State University. The current practice for visualizing behaviour within a digital learning environment has been to generate plots based on hand-engineered or coded features using domain knowledge. While this approach has been effective in relating behaviour to known phenomena, features crafted from domain knowledge are unlikely to be well suited to making unfamiliar phenomena salient and can thus preclude discovery. We introduce a methodology for organically surfacing behavioural regularities from clickstream data, conducting an expert-in-the-loop hyperparameter search, and identifying anticipated as well as newly discovered patterns of behaviour. While these visualization techniques have been used before in the broader machine learning community to better understand neural networks and the relationships between word vectors, we apply them to online behavioural learner data and go a step further, exploring the impact of the model's parameters on producing tangible, non-trivial observations of behaviour that suggest pedagogical improvements to the course designers and instructors. The methodology introduced in this paper led to an improved understanding of passing and non-passing student behaviour in the course and is widely applicable to other datasets of clickstream activity where investigators and stakeholders wish to organically surface principal patterns of behaviour.
NOTES FOR PRACTICE
• Continuous representation visualization can produce a high-level view of emergent student behavior online without the need for defining features or tagging
• Differential visualization of passing and non-passing student course behaviors can help identify deep and shallow learning strategies and provide instructors with essential information for modifying the curricula to discourage strategies associated with failure
• Involving instructors in the tuning of the visualization and model parameters produces analyses with a desirable mixture of expected and unexpected, but explainable, patterns
• Layering on additional data, such as when students create a discussion post, further contextualizes insight into student learning strategies from visualizations
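A minimal sketch of the representation-learning step described above: a skip-gram model trained on page-view sequences, so that pages occurring in similar behavioural contexts receive nearby vectors that can then be projected for visualization. This toy implementation uses a full softmax for clarity; the window size, dimensionality, and learning rate are illustrative assumptions, and practical systems (e.g., word2vec) use negative sampling instead.

```python
import numpy as np

def skipgram_pairs(sequence, window=2):
    """Enumerate (center, context) pairs from one learner's page-view sequence."""
    pairs = []
    for i, center in enumerate(sequence):
        lo, hi = max(0, i - window), min(len(sequence), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, sequence[j]))
    return pairs

def train_embeddings(sequences, dim=8, window=2, lr=0.05, epochs=10, seed=0):
    """Skip-gram with a full-softmax SGD update (toy scale only)."""
    vocab = sorted({p for s in sequences for p in s})
    idx = {p: i for i, p in enumerate(vocab)}
    rng = np.random.default_rng(seed)
    n = len(vocab)
    w_in = rng.normal(scale=0.1, size=(n, dim))   # page (input) vectors
    w_out = rng.normal(scale=0.1, size=(n, dim))  # context (output) vectors
    for _ in range(epochs):
        for seq in sequences:
            for center, context in skipgram_pairs(seq, window):
                ci, oi = idx[center], idx[context]
                scores = w_out @ w_in[ci]
                probs = np.exp(scores - scores.max())
                probs /= probs.sum()
                grad = probs
                grad[oi] -= 1.0                  # softmax cross-entropy gradient
                g_in = w_out.T @ grad            # gradient wrt the center vector
                w_out -= lr * np.outer(grad, w_in[ci])
                w_in[ci] -= lr * g_in
    return vocab, w_in
```

The resulting `w_in` vectors are what a 2D projection (e.g., t-SNE) would render for the expert-in-the-loop inspection the paper describes.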
“…In an online course context, the available input includes event-stream (or clickstream) data, which features inputs of mixed type. The output can be any logged outcome, such as certification in the course [23], stop-out [24,25], or the learner's next action in the course [26]. In the last case, behavior is both the input and the output.…”
As the modal sources of data in education have shifted over the past few decades, so too have the modeling paradigms applied to these data. In this paper, we overview the principal foci of modeling in the areas of standardized testing, computer tutoring, and online courses, from which these big data have come, and provide a rationale for their adoption in each context. As these data become more behavioral in nature, we argue that a shift to connectionist modeling paradigms is called for, as well as a reaffirmation of the ethical responsibilities of big data analysis in education.