Proceedings of the 2021 International Conference on Multimodal Interaction 2021
DOI: 10.1145/3462244.3479944

Engagement Rewarded Actor-Critic with Conservative Q-Learning for Speech-Driven Laughter Backchannel Generation

Abstract: We propose a speech-driven laughter backchannel generation model to reward engagement during human-agent interaction. We formulate the problem as a Markov decision process where speech signal represents the state and the objective is to maximize human engagement. Since online training is often impractical in the case of human-agent interaction, we utilize the existing human-to-human dyadic interaction datasets to train our agent for the backchannel generation task. We address the problem using an actor-critic m…
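The abstract is truncated here, so the paper's exact objective is not visible. As a rough illustration of the conservative Q-learning idea named in the title, the sketch below computes a generic discrete-action CQL loss on a batch of logged transitions. Everything in it is an assumption made for illustration: the function name `cql_loss`, the two-action set (0 = stay silent, 1 = laugh), the engagement-style scalar reward, and the values of `gamma` and `alpha`. It should not be read as the authors' implementation.

```python
import math

def cql_loss(q_values, actions, rewards, next_q_values, gamma=0.99, alpha=1.0):
    """Generic discrete-action CQL loss on a batch of logged transitions.

    q_values[i]      : list of Q(s_i, a) for every action a
    actions[i]       : index of the action taken in the dataset
    rewards[i]       : scalar reward (assumed here to be an engagement score)
    next_q_values[i] : list of target Q(s'_i, a) for every action a
    Returns (total, bellman, penalty).
    """
    bellman_terms, penalty_terms = [], []
    for q, a, r, q_next in zip(q_values, actions, rewards, next_q_values):
        # Standard one-step TD target with a greedy bootstrap.
        target = r + gamma * max(q_next)
        bellman_terms.append(0.5 * (q[a] - target) ** 2)
        # Conservative term: log-sum-exp over all actions minus the Q-value
        # of the logged action. This pushes down Q-values of actions that
        # the dataset does not support.
        m = max(q)
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in q))
        penalty_terms.append(log_sum_exp - q[a])
    n = len(actions)
    bellman = sum(bellman_terms) / n
    penalty = sum(penalty_terms) / n
    return bellman + alpha * penalty, bellman, penalty


# Hypothetical batch of two transitions with actions {0: silent, 1: laugh}.
total, bellman, penalty = cql_loss(
    q_values=[[0.0, 0.0], [0.0, 0.0]],
    actions=[0, 1],
    rewards=[1.0, 0.0],
    next_q_values=[[0.0, 0.0], [0.0, 0.0]],
)
```

The penalty term is never negative, because the log-sum-exp over all actions is at least as large as any single Q-value. That is what makes the learned values conservative when training only from recorded human-to-human data.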

Cited by 4 publications (1 citation statement)
References 26 publications
“…To the best of our knowledge, this work (including our preliminary papers [24], [25]) is the first work on batch reinforcement learning for engaging backchannel behavior of a robot. We also note that a very recent work [68] uses conservative Q-learning as a batch-RL algorithm to learn a backchannel policy that enhances engagement while statistically matching the human laughter generation in dyadic conversations. It, however, only trains for laugh events and does not include any user study that validates the proposed method on a real human-robot interaction setting.…”
Section: Related Work
Citation type: mentioning
confidence: 99%