Affective and cognitive processes form a rich substrate on which learning plays out. Affective states often influence progress on learning tasks, resulting in positive or negative cycles of affect that impact learning outcomes. Developing a detailed account of the occurrence and timing of cognitive-affective states during learning can inform the design of affective tutorial interventions. To advance understanding of learning-centered affect, this paper reports on a study that analyzed a video corpus of computer-mediated human tutoring using an automated facial expression recognition tool that detects fine-grained facial movements. The results reveal three significant relationships between facial expression, frustration, and learning: 1) Action Unit 2 (outer brow raise) was negatively correlated with learning gain, 2) Action Unit 4 (brow lowering) was positively correlated with frustration, and 3) Action Unit 14 (mouth dimpling) was positively correlated with both frustration and learning gain. Additionally, early prediction models demonstrated that facial actions during the first five minutes were significantly predictive of frustration and learning at the end of the tutoring session. These results represent a step toward a deeper understanding of learning-centered affective states, which will form the foundation for data-driven design of affective tutoring systems.
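The paper does not include code, but the session-level correlation analysis it describes can be sketched as follows. This is a minimal, illustrative outline, not the authors' pipeline: all column names (e.g., au2_prop for the proportion of video frames in which Action Unit 2 was detected) and the choice of Pearson correlation are assumptions.

```python
# Illustrative sketch only; the study's own analysis code is not published.
# Assumes one row per tutoring session with the proportion of frames in
# which each Action Unit (AU) was detected, a retrospective frustration
# self-report, and a normalized pretest/posttest learning gain. All column
# names are hypothetical.
import pandas as pd
from scipy import stats

def correlate_aus_with_outcomes(sessions: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation of each AU measure with each outcome."""
    au_cols = ["au2_prop", "au4_prop", "au14_prop"]   # AU2, AU4, AU14
    outcome_cols = ["frustration", "learning_gain"]
    rows = []
    for au in au_cols:
        for outcome in outcome_cols:
            r, p = stats.pearsonr(sessions[au], sessions[outcome])
            rows.append({"au": au, "outcome": outcome, "r": r, "p": p})
    return pd.DataFrame(rows)
```

Under these assumptions, the reported pattern would surface as a negative r for au2_prop against learning_gain and positive r values for au4_prop against frustration and for au14_prop against both outcomes.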
Detecting learning-centered affective states is difficult, yet crucial for adapting effectively to users. Within tutoring in particular, the combined context of student task actions and tutorial dialogue shapes the student's affective experience. As we move toward detecting affect, we may also supplement the task and dialogue streams with rich sensor data. In a study of introductory computer programming tutoring, human tutors communicated with students through a text-based interface. Automated approaches were used to annotate dialogue, task actions, facial movements, postural positions, and hand-to-face gestures. These dialogue, nonverbal behavior, and task action input streams were then used to predict retrospective student self-reports of engagement and frustration, as well as pretest/posttest learning gains. The results show that the combined set of multimodal features is most predictive, indicating an additive effect. Additionally, the findings demonstrate that the role of nonverbal behavior may depend on the dialogue and task context in which it occurs. This line of research identifies contextual and behavioral cues that may be leveraged in future adaptive multimodal systems.
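The modality comparison behind the "additive effect" finding can likewise be sketched. The following is an illustrative outline, not the authors' code: the feature-group names and the ridge-regression/cross-validation setup are assumptions chosen to show how unimodal predictors would be compared against the combined multimodal feature set.

```python
# Illustrative sketch, not the authors' pipeline. Assumes a per-student
# feature matrix X whose columns fall into hypothetical modality groups,
# and an outcome y (e.g., frustration self-report or learning gain).
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

def compare_feature_sets(X: pd.DataFrame, y, groups: dict) -> dict:
    """Cross-validated R^2 for each modality alone and for all combined."""
    scores = {}
    for name, cols in groups.items():
        scores[name] = cross_val_score(Ridge(), X[cols], y,
                                       cv=5, scoring="r2").mean()
    all_cols = [c for cols in groups.values() for c in cols]
    scores["combined"] = cross_val_score(Ridge(), X[all_cols], y,
                                         cv=5, scoring="r2").mean()
    return scores

# Hypothetical usage:
# groups = {"dialogue": [...], "task": [...], "nonverbal": [...]}
# An additive effect would show as scores["combined"] exceeding every
# unimodal score.
```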