We present a novel method for training a social robot to generate backchannels during human-robot interaction. We address the problem within an off-policy reinforcement learning framework, and show how a robot may learn to produce non-verbal backchannels, such as laughs, when trained to maximize the engagement and attention of the user. A major contribution of this work is the formulation of the problem as a Markov decision process (MDP), with states defined by the speech activity of the user and rewards generated by quantified engagement levels. The problem we address falls into the class of applications where unlimited interaction with the environment is not possible (our environment being a human), because it may be time-consuming, costly, impracticable, or even dangerous if a bad policy is executed. Therefore, we employ a deep Q-network (DQN) within a batch reinforcement learning framework, where an optimal policy is learned from batch data collected using a more controlled policy. We suggest the use of human-to-human dyadic interaction datasets as a batch of trajectories to train an agent for engaging interactions. Our experiments demonstrate the potential of our method to train a robot for engaging behaviors in an offline manner.
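The setup this abstract describes (states from user speech activity, backchannel actions, engagement rewards, learning from a fixed batch rather than live interaction) can be sketched with a toy tabular Q-learning loop. The paper trains a DQN; the tabular update, states, actions, rewards, and logged transitions below are invented stand-ins for illustration only.

```python
import numpy as np

# Toy sketch of learning a backchannel policy from a fixed batch of
# logged interactions. A tabular Q-learning update over repeated sweeps
# of the batch stands in for the paper's DQN; all numbers are invented.

STATES = {"user_speaking": 0, "user_silent": 1}
ACTIONS = {"no_backchannel": 0, "laugh": 1}

# Logged (state, action, reward, next_state) tuples, as if recorded
# from human-to-human dyadic interactions under some behavior policy.
batch = [
    (0, 0, 0.1, 0),   # staying quiet while the user speaks: mildly engaging
    (0, 1, -0.5, 0),  # laughing over the speaker: engagement drops
    (1, 0, 0.0, 1),   # silence at a pause: neutral
    (1, 1, 0.8, 0),   # a well-timed laugh at a pause: engagement rises
]

def fitted_q(batch, gamma=0.9, alpha=0.1, sweeps=500):
    """Sweep the fixed batch repeatedly, applying the Q-learning update."""
    Q = np.zeros((len(STATES), len(ACTIONS)))
    for _ in range(sweeps):
        for s, a, r, s_next in batch:
            target = r + gamma * Q[s_next].max()
            Q[s, a] += alpha * (target - Q[s, a])
    return Q

Q = fitted_q(batch)
policy = Q.argmax(axis=1)  # greedy action in each state
```

On this toy batch the greedy policy learns to stay quiet while the user speaks and to laugh at pauses; no further environment interaction is needed, which is the point of the batch setting.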
Image-based object tracking has always remained a challenging task because of numerous video complexities, such as illumination variations, posture or view-angle alterations, object appearance changes, and partial and full occlusions. Another important constraint is the necessity of real-time processing of an online video stream. The tracking technique and the object appearance model play a critical role in the success of a tracker. This work presents a new methodology for object tracking, 'IS-ObjTrack', which utilizes a computational-intelligence-based tracking algorithm employing the particle swarm optimization (PSO) technique. PSO provides robustness and time efficiency. The major advantage of the proposed IS-ObjTrack is the utilization of the histogram of oriented gradients (HOG) for the development of an object appearance model. The proposed HOG-based appearance model is readily exploited by PSO for fast, i.e. real-time, object tracking. HOG belongs to the class of gradient-based filters and hence shows excellent results for objects with distinct edges. The appearance model is designed for adaptation, whereby its parameters are updated in an online manner. Experimental comparison with existing intelligent tracking systems shows the efficiency of the proposed IS-ObjTrack approach.
Index Terms - Object tracking, swarm intelligence, appearance model, histogram of oriented gradients, particle swarm optimization.
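The core idea, a PSO swarm searching window positions whose HOG descriptor best matches an appearance model, can be sketched on a synthetic frame. This is not the paper's IS-ObjTrack implementation: the frame, the single-histogram descriptor (real HOG adds cells, blocks, and block normalization), and all PSO parameters are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic frame: a bright 16x16 square (the "object") on a dark
# background. A real tracker would match the model against each new
# video frame; here the same frame is searched for simplicity.
frame = np.zeros((64, 64))
frame[20:36, 40:56] = 1.0
W = 16  # search-window size

# Precompute image gradients once; the descriptor is a single unsigned
# orientation histogram per window, a heavily simplified HOG.
GY, GX = np.gradient(frame)
MAG = np.hypot(GX, GY)
ANG = np.arctan2(GY, GX) % np.pi  # unsigned orientations in [0, pi)

def hog_descriptor(x, y, bins=8):
    m = MAG[y:y + W, x:x + W].ravel()
    a = ANG[y:y + W, x:x + W].ravel()
    hist, _ = np.histogram(a, bins=bins, range=(0, np.pi), weights=m)
    return hist / (np.linalg.norm(hist) + 1e-9)

template = hog_descriptor(40, 20)  # appearance model from the known start

def cost(pos):
    """Distance between the candidate window's descriptor and the model."""
    x = int(np.clip(round(pos[0]), 0, 64 - W))
    y = int(np.clip(round(pos[1]), 0, 64 - W))
    return float(np.linalg.norm(hog_descriptor(x, y) - template))

def pso(cost, n_particles=30, iters=80, lo=0.0, hi=48.0,
        w=0.7, c1=1.5, c2=1.5):
    """Standard global-best PSO over 2-D window positions."""
    pos = rng.uniform(lo, hi, size=(n_particles, 2))
    vel = np.zeros_like(pos)
    pbest, pbest_cost = pos.copy(), np.array([cost(p) for p in pos])
    gbest = pbest[pbest_cost.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, 1))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        costs = np.array([cost(p) for p in pos])
        better = costs < pbest_cost
        pbest[better], pbest_cost[better] = pos[better], costs[better]
        gbest = pbest[pbest_cost.argmin()].copy()
    return gbest, float(pbest_cost.min())

best, best_cost = pso(cost)  # swarm converges near the square at (40, 20)
```

Because the swarm evaluates only a few dozen candidate windows per iteration instead of an exhaustive scan, this kind of search is what makes PSO attractive for the real-time constraint the abstract mentions.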
The ability of an agent to generate appropriate verbal and non-verbal backchannels during human-robot interaction greatly enhances the interaction experience. Backchannels are particularly important in applications like tutoring and counseling, which require constant attention and engagement from the user. We present here a method for training a robot for backchannel generation during human-robot interaction within the reinforcement learning (RL) framework, with the goal of maintaining a high engagement level. Since online learning by interaction with a human is highly time-consuming and impractical, we take advantage of a recorded human-to-human interaction dataset and approach our problem as a batch reinforcement learning problem. The dataset is utilized as batch data acquired by some behavior policy. We perform experiments with laughs as a backchannel and train an agent with value-based techniques. In particular, we demonstrate the effectiveness of recurrent layers in the approximate value function for this problem, which boost performance in partially observable environments. With off-policy policy evaluation, we show that the RL agents are expected to produce more engagement than an agent trained by imitation learning.
Keywords: human-robot interaction · engagement · partially observable Markov decision process · batch reinforcement learning
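The off-policy policy evaluation step mentioned here, estimating how well a learned policy would do using only trajectories logged under a different behavior policy, can be sketched with ordinary per-trajectory importance sampling. The estimator form is standard; the policies and trajectories in the example are invented for illustration, not the paper's data.

```python
import numpy as np

# Sketch of off-policy policy evaluation (OPE) via ordinary importance
# sampling: reweight each logged trajectory's return by the ratio of
# target-policy to behavior-policy action probabilities.

def is_estimate(trajectories, behavior, target, gamma=0.99):
    """Per-trajectory importance-sampling estimate of the target value.

    Each trajectory is a list of (state, action, reward) steps;
    behavior[s][a] and target[s][a] are action probabilities.
    """
    values = []
    for traj in trajectories:
        rho, ret = 1.0, 0.0
        for t, (s, a, r) in enumerate(traj):
            rho *= target[s][a] / behavior[s][a]
            ret += gamma ** t * r
        values.append(rho * ret)
    return float(np.mean(values))

# Invented example: uniform behavior policy, deterministic target policy.
behavior = {0: {0: 0.5, 1: 0.5}}
target = {0: {0: 0.0, 1: 1.0}}
logged = [[(0, 1, 1.0)], [(0, 0, 0.0)]]
estimate = is_estimate(logged, behavior, target)  # (2.0*1.0 + 0.0)/2 = 1.0
```

The target policy always takes action 1 and earns reward 1.0, and the estimator recovers exactly that value from the logged data, which is why OPE lets the agents be compared without deploying them on humans.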
A key aspect of social human-robot interaction is natural non-verbal communication. In this work, we train an agent with batch reinforcement learning to generate nods and smiles as backchannels in order to increase the naturalness of the interaction and to engage humans. We introduce the Sequential Random Deep Q-Network (SRDQN) method to learn a policy for backchannel generation that explicitly maximizes user engagement. The proposed SRDQN method outperforms existing vanilla Q-learning methods when evaluated using off-policy policy evaluation techniques. Furthermore, to verify the effectiveness of SRDQN, a human-robot experiment has been designed and conducted with an expressive 3D robot head. The experiment is based on a story-shaping game designed to create an interactive social activity with the robot. The engagement of the participants during the interaction is computed from the user's social signals, such as backchannels, mutual gaze, and adjacency pairs. The subjective feedback from participants and the engagement values strongly indicate that our framework is a step forward towards the autonomous learning of a socially acceptable backchanneling behavior.
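One way to picture an engagement value computed from the social signals this abstract names is a weighted combination of per-session signal rates. This is purely illustrative: the linear form and the weights are invented here and are not the paper's actual engagement measure.

```python
# Purely illustrative engagement score over the social signals named in
# the abstract (backchannels, mutual gaze, adjacency pairs). Inputs are
# assumed normalized to [0, 1]; the weights are invented placeholders.
def engagement_score(backchannel_rate, mutual_gaze_frac, adjacency_pair_rate,
                     weights=(0.3, 0.4, 0.3)):
    signals = (backchannel_rate, mutual_gaze_frac, adjacency_pair_rate)
    return sum(w * s for w, s in zip(weights, signals))
```

With normalized inputs and weights summing to one, the score stays in [0, 1], which makes it directly usable as a reward signal for the kind of engagement-maximizing training described above.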