The term "believability" is often used to describe expectations concerning virtual agents. In this paper, we analyze which factors influence the believability of an agent acting as a software assistant. We consider several factors, such as embodiment, communicative behavior, and emotional capabilities. We conduct a perceptive study in which we analyze the role of plausible and/or appropriate emotional displays in relation to believability. We also investigate how people judge the believability of the agent, and whether it provokes social reactions in humans toward it. Finally, we evaluate the respective impact of embodiment and emotion on believability judgments. The results of our study show that (a) appropriate emotions lead to higher perceived believability, (b) the notion of believability is closely correlated with two major socio-cognitive variables, namely competence and warmth, and (c) considering an agent believable can differ from having a human-like attitude toward it. In addition, a primacy of emotional behavior over embodiment in believability judgments is hypothesized from the free responses given by the participants of this experiment.
One of the challenges of designing virtual humans is the definition of appropriate models of the relation between realistic emotions and the coordination of behaviors in several modalities. In this paper, we present the annotation, representation, and modeling of multimodal visual behaviors occurring during complex emotions. We illustrate our work using a corpus of TV interviews. This corpus has been annotated at several levels of information: communicative acts, emotion labels, and multimodal signs. We have defined a copy-synthesis approach to drive an Embodied Conversational Agent from these different levels of information. The second part of our paper focuses on a model of complex (superposition and masking of) emotions in facial expressions of the agent. We explain how the complementary aspects of our work on corpus and computational model are used to specify complex emotional behaviors.
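A common simplification for superposing or masking emotions in a face is to assign different facial regions to different emotions. The sketch below illustrates that idea only; the region names, parameter values, and blending scheme are hypothetical and are not the paper's actual model:

```python
# Hypothetical sketch: superpose two facial expressions by letting chosen
# face regions come from one emotion while the rest come from the other.
def superpose(expr_a, expr_b, regions_from_a):
    # expr_a, expr_b: mapping of facial region -> activation level (0..1)
    # regions_from_a: regions whose activation is taken from expr_a
    blended = dict(expr_b)
    for region in regions_from_a:
        blended[region] = expr_a[region]
    return blended

joy = {"brows": 0.1, "eyes": 0.6, "mouth": 0.9}
sadness = {"brows": 0.8, "eyes": 0.4, "mouth": 0.2}

# e.g. masking sadness with a joyful mouth while sad brows remain visible
print(superpose(joy, sadness, regions_from_a={"mouth"}))
# → {'brows': 0.8, 'eyes': 0.4, 'mouth': 0.9}
```

A real system would blend continuous animation parameters per region rather than copying whole-region values, but the region-wise decomposition is the core idea.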
Emotion, mood, and stress recognition (EMSR) has been studied in laboratory settings for decades. In particular, physiological signals are widely used to detect and classify affective states in lab conditions. However, physiological reactions to emotional stimuli have been found to differ in laboratory and natural settings. Thanks to recent technological progress (e.g., in wearables), the creation of EMSR systems for a large number of consumers during their everyday activities is increasingly possible. Therefore, datasets created in the wild are needed to ensure the validity and exploitability of EMSR models for real-life applications. In this paper, we initially present common techniques used in laboratory settings to induce emotions for the purpose of physiological dataset creation. Next, advantages and challenges of data collection in the wild are discussed. To assess the applicability of existing datasets to real-life applications, we propose a set of categories to guide and compare at a glance different methodologies used by researchers to collect such data. For this purpose, we also introduce a visual tool called Graphical Assessment of Real-life Application-Focused Emotional Dataset (GARAFED). In the last part of the paper, we apply the proposed tool to compare existing physiological datasets for EMSR in the wild and to show possible improvements and future directions of research. We wish for this paper and GARAFED to be used as guidelines for researchers and developers who aim at collecting affect-related data for real-life EMSR-based applications.
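Comparing datasets "at a glance" along a fixed set of categories can be sketched as a simple coverage checklist. The category names below are illustrative placeholders, not GARAFED's actual assessment axes:

```python
# Hypothetical category checklist for comparing EMSR datasets.
# These category names are illustrative only, not GARAFED's real axes.
CATEGORIES = ["in_the_wild", "wearable_sensors", "self_reports", "long_term"]

def coverage(dataset):
    # Fraction of the comparison categories a dataset satisfies.
    return sum(dataset.get(c, False) for c in CATEGORIES) / len(CATEGORIES)

datasets = {
    "dataset_A": {"in_the_wild": True, "wearable_sensors": True},
    "dataset_B": {"in_the_wild": True, "wearable_sensors": True,
                  "self_reports": True, "long_term": True},
}
for name, ds in datasets.items():
    print(name, coverage(ds))  # dataset_A → 0.5, dataset_B → 1.0
```

A graphical tool like GARAFED would render such per-category coverage visually (e.g., as a radar chart) rather than as a single fraction.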
This study investigated which features of AVATAR laughter are perceived as threatening by individuals with a fear of being laughed at (gelotophobia) and by individuals without gelotophobia. Laughter samples were systematically varied (e.g., pitch and energy of the voice, intensity of facial actions) in three modalities: animated facial expressions, synthesized auditory laughter vocalizations, and motion-capture-generated puppets displaying laughter body movements. In the online study, 123 adults completed the GELOPH<15> (Ruch and Proyer, 2008a,b) and rated randomly presented videos of the three modalities for how malicious, how friendly, and how real the laughter was (0 = not at all to 8 = extremely). Additionally, an open question asked which markers led to the perception of friendliness/maliciousness. The current study identified features in all modalities of laughter stimuli that were perceived as malicious in general, and some that were gelotophobia-specific. For facial expressions of AVATARS, medium-intensity laughs triggered the highest maliciousness in gelotophobes. In the auditory stimuli, the fundamental frequency modulations and the variation in intensity were indicative of maliciousness. In the body, backwards and forward movements and rocking vs. jerking movements distinguished the most malicious from the least malicious laugh. From the open answers, the shape and curling of the lips led non-gelotophobes to perceive the expression as malicious, while movement around the eyes made the face appear friendly; this pattern was reversed for gelotophobes. Gelotophobia-savvy AVATAR laughter should be of high intensity, contain lip and eye movements, and be a fast, non-repetitive, voiced vocalization that is variable and of short duration. It should not contain any features that indicate down-regulation in the voice or body, or voluntary/cognitive modulation.
Food and eating are inherently social activities taking place, for example, around the dining table at home, in restaurants, or in public spaces. Enjoying eating with others, often referred to as "commensality," positively affects mealtime in terms of, among other factors, food intake, food choice, and food satisfaction. In this paper we discuss the concept of "Computational Commensality," that is, technology which computationally addresses various social aspects of food and eating. In the past few years, Human-Computer Interaction started to address how interactive technologies can improve mealtimes. However, the main focus so far has been on improving the individual's experience, rather than considering the inherently social nature of food consumption. In this survey, we first present research from the field of social psychology on the social relevance of Food- and Eating-related Activities (F&EA). Then, we review existing computational models and technologies that can contribute, in the near future, to achieving Computational Commensality. We also discuss the related research challenges and indicate future applications of such new technology that can potentially improve F&EA from the commensality perspective.
The AVLaughterCycle project aims at developing an audiovisual laughing machine, able to detect and respond to a user's laughs. Laughter is an important cue to reinforce the engagement in human-computer interactions. As a first step toward this goal, we have implemented a system capable of recording the laugh of a user and responding to it with a similar laugh. The output laugh is automatically selected from an audiovisual laughter database by analyzing acoustic similarities with the input laugh. It is displayed by an Embodied Conversational Agent, animated using the audio-synchronized facial movements of the subject who originally uttered the laugh. The application is fully implemented, works in real time, and a large audiovisual laughter database has been recorded as part of the project. This paper presents AVLaughterCycle, its underlying components, the freely available laughter database, and the application architecture. Portions of this work have been presented in "Proceedings of eNTERFACE'09" [36]. The paper also includes evaluations of several core components of the application. Objective tests show that the similarity search engine, though simple, significantly outperforms chance for grouping laughs by speaker or type. This result can be considered as a first measurement for computing acoustic similarities between laughs. A subjective evaluation has also been conducted to measure the influence of the visual cues on the users' evaluation of similarity between laughs.
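Selecting a response laugh by acoustic similarity amounts to a nearest-neighbour search over feature vectors. The sketch below shows that idea under stated assumptions: the feature dimensions, database contents, and distance metric are hypothetical and not AVLaughterCycle's actual descriptors:

```python
import math

def euclidean(a, b):
    # Euclidean distance between two acoustic feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def most_similar_laugh(input_features, database):
    # database: mapping of laugh id -> acoustic feature vector.
    # Returns the id of the stored laugh closest to the input laugh.
    return min(database, key=lambda lid: euclidean(input_features, database[lid]))

# Hypothetical database: each vector could hold, e.g., intensity,
# mean pitch (Hz), and a duration ratio (illustrative dimensions only).
laughs = {
    "laugh_01": [0.8, 120.0, 0.3],
    "laugh_02": [0.2, 95.0, 0.7],
}
print(most_similar_laugh([0.75, 118.0, 0.35], laughs))  # → laugh_01
```

A production system would normalize each feature dimension before computing distances so that large-valued features (such as pitch in Hz) do not dominate the metric.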
Full-body human movement is characterized by fine-grained expressive qualities that humans are easily capable of exhibiting and recognizing in others' movement. In sports (e.g., martial arts) and performing arts (e.g., dance), the same sequence of movements can be performed in a wide range of ways characterized by different qualities, often in terms of subtle (spatial and temporal) perturbations of the movement. Even a non-expert observer can distinguish between a top-level and average performance by a dancer or martial artist. The difference is not in the performed movements (the same in both cases) but in the "quality" of their performance. In this article, we present a computational framework aimed at an automated approximate measure of movement quality in full-body physical activities. Starting from motion capture data, the framework computes low-level (e.g., a limb velocity) and high-level (e.g., synchronization between different limbs) movement features. Then, this vector of features is integrated to compute a value aimed at providing a quantitative assessment of movement quality approximating the evaluation that an external expert observer would give of the same sequence of movements. Next, a system representing a concrete implementation of the framework is proposed. Karate is adopted as a testbed. We selected two different katas (i.e., detailed choreographies of movements in karate) characterized by different overall attitudes and expressions (aggressiveness, meditation), and we asked seven athletes, having various levels of experience and age, to perform them. Motion capture data were collected from the performances and were analyzed with the system. The results of the automated analysis were compared with the scores given by 14 karate experts who rated the same performances.
Results show that the movement-quality scores computed by the system and the ratings given by the human observers are highly correlated (Pearson's correlations r = 0.84, p = 0.001 and r = 0.75, p = 0.005).
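The reported agreement between system scores and expert ratings is a standard Pearson product-moment correlation, which can be computed as follows; the sample values below are illustrative, not the study's data:

```python
import math

def pearson_r(x, y):
    # Pearson product-moment correlation between two equal-length samples
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Illustrative values only: automated scores vs. expert ratings
system_scores = [0.62, 0.71, 0.55, 0.80, 0.68]
expert_ratings = [6.0, 7.5, 5.0, 8.5, 7.0]
print(round(pearson_r(system_scores, expert_ratings), 3))
```

Values of r close to 1 indicate that the two rankings move together, which is what the study reports for the system's scores against the expert panel.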
In this paper, we present (i) a computational model of Dynamic Symmetry of human movement, and (ii) a system to teach this movement quality (symmetry or asymmetry) by means of an interactive sonification exergame based on IMU sensors and the EyesWeb XMI software platform. The implemented system is available as a demo at the workshop.
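As a hedged illustration of what a symmetry measure over movement data might look like, the sketch below computes a generic left/right energy-balance index; this is an assumption for illustration, not the paper's actual Dynamic Symmetry model:

```python
# Hypothetical symmetry index over per-side movement energy (e.g., summed
# accelerations from left- and right-side IMU sensors over a time window).
def symmetry_index(left_energy, right_energy):
    # 1.0 = perfectly symmetric, approaching 0.0 = fully one-sided.
    total = left_energy + right_energy
    if total == 0:
        return 1.0  # no movement on either side counts as symmetric
    return 1.0 - abs(left_energy - right_energy) / total

print(symmetry_index(4.0, 4.0))  # → 1.0
print(symmetry_index(6.0, 2.0))  # → 0.5
```

In an interactive sonification exergame, such an index could be mapped to a sound parameter in real time so the player hears how symmetric their movement currently is.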