Engagement Rewarded Actor-Critic with Conservative Q-Learning for Speech-Driven Laughter Backchannel Generation

Bayramoglu, Oyku; Erzin, Engin; Sezgin, T. Metin; Yemez, Y.

doi:10.1145/3462244.3479944

Cited by 4 publications

(1 citation statement)

References 26 publications


“…To the best of our knowledge, this work (including our preliminary papers [24], [25]) is the first work on batch reinforcement learning for engaging backchannel behavior of a robot. We also note that a very recent work [68] uses conservative Q-learning as a batch-RL algorithm to learn a backchannel policy that enhances engagement while statistically matching the human laughter generation in dyadic conversations. It, however, only trains for laugh events and does not include any user study that validates the proposed method on a real human-robot interaction setting.…”

Section: Related Work (mentioning)

confidence: 99%

Training Socially Engaging Robots: Modeling Backchannel Behaviors with Batch Reinforcement Learning

Hussain, Erzin, Sezgin et al. 2022

IEEE Trans. Affective Comput.


A key aspect of social human-robot interaction is natural non-verbal communication. In this work, we train an agent with batch reinforcement learning to generate nods and smiles as backchannels in order to increase the naturalness of the interaction and to engage humans. We introduce the Sequential Random Deep Q-Network (SRDQN) method to learn a policy for backchannel generation that explicitly maximizes user engagement. The proposed SRDQN method outperforms the existing vanilla Q-learning methods when evaluated using off-policy policy evaluation techniques. Furthermore, to verify the effectiveness of SRDQN, a human-robot experiment has been designed and conducted with an expressive 3D robot head. The experiment is based on a story-shaping game designed to create an interactive social activity with the robot. The engagement of the participants during the interaction is computed from users' social signals such as backchannels, mutual gaze, and adjacency pairs. The subjective feedback from participants and the engagement values strongly indicate that our framework is a step forward towards the autonomous learning of socially acceptable backchanneling behavior.

