Interspeech 2019
DOI: 10.21437/interspeech.2019-2521

Speech Driven Backchannel Generation Using Deep Q-Network for Enhancing Engagement in Human-Robot Interaction

Abstract: We present a novel method for training a social robot to generate backchannels during human-robot interaction. We address the problem within an off-policy reinforcement learning framework, and show how a robot may learn to produce non-verbal backchannels like laughs when trained to maximize the engagement and attention of the user. A major contribution of this work is the formulation of the problem as a Markov decision process (MDP) with states defined by the speech activity of the user and rewards generated …
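The off-policy formulation the abstract describes can be illustrated with a minimal sketch. This is a hedged toy example, not the authors' actual setup: the three speech-activity states, the backchannel action set, the logged rewards, and the tabular Q-table standing in for the paper's deep Q-network are all illustrative assumptions.

```python
import numpy as np

# Hypothetical discretisation: states index the user's speech activity
# (0 = silence, 1 = speaking, 2 = pause after speech); actions are
# backchannels (0 = none, 1 = laugh, 2 = nod). All values are invented.
N_STATES, N_ACTIONS = 3, 3
GAMMA, ALPHA = 0.9, 0.1

def batch_q_update(Q, transitions):
    """One sweep of off-policy Q-learning over logged (s, a, r, s') tuples."""
    for s, a, r, s_next in transitions:
        target = r + GAMMA * Q[s_next].max()   # bootstrap from greedy value
        Q[s, a] += ALPHA * (target - Q[s, a])  # move Q toward the target
    return Q

# Toy logged interactions: laughing (a=1) right after the user stops
# speaking (s=2) yields a positive engagement reward; laughing while the
# user is still talking (s=1) is penalised.
logged = [(2, 1, 1.0, 0), (2, 0, 0.0, 0), (1, 1, -0.5, 1)]
Q = np.zeros((N_STATES, N_ACTIONS))
for _ in range(200):
    Q = batch_q_update(Q, logged)
greedy = int(Q[2].argmax())  # learned backchannel after user speech ends
```

Under this toy data the greedy policy learns to emit a laugh in the pause-after-speech state, which is the qualitative behaviour the abstract describes.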

Cited by 11 publications (14 citation statements)
References 22 publications (26 reference statements)
“…We formulate the problem of backchannel generation as an MDP, similarly to [9]. We use an off-policy actor-critic method to learn from a human-to-human interaction dataset, from which we sample batches of trajectories.…”
Section: Proposed Methods
confidence: 99%
“…Table 1 shows the off-policy evaluation (OPE) results obtained on the test data for three off-policy RL methods: neural fitted Q-learning (NFQ) [21], batch-DQN [9], and our method (AC-CQL), as well as for a supervised learning (SL) method [30] and a mirroring policy (MP), which mimics the speaker's laughter in the dataset [31]. Mirroring can be thought of as an effective baseline policy, since imitating others' laughter is commonplace in human-to-human social interaction.…”
Section: Off-policy Policy Evaluation
confidence: 99%
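The off-policy policy evaluation this statement refers to can be sketched with the simplest OPE estimator, per-trajectory importance sampling. This is a generic illustration, not the estimator used in the cited work; the toy policies and trajectories below are invented for the example.

```python
import numpy as np

def importance_sampling_ope(trajectories, target_policy, behavior_policy, gamma=0.9):
    """Estimate the value of target_policy from trajectories collected under
    behavior_policy. Each trajectory is a list of (state, action, reward)
    tuples; each policy maps a state to a list of action probabilities."""
    estimates = []
    for traj in trajectories:
        rho = 1.0  # cumulative importance ratio for this trajectory
        ret = 0.0  # discounted return of this trajectory
        for t, (s, a, r) in enumerate(traj):
            rho *= target_policy[s][a] / behavior_policy[s][a]
            ret += (gamma ** t) * r
        estimates.append(rho * ret)
    return float(np.mean(estimates))

# Toy example: uniform logging policy, deterministic target policy.
behavior = {0: [0.5, 0.5]}
target = {0: [1.0, 0.0]}
trajs = [[(0, 0, 1.0)], [(0, 1, 1.0)]]
estimate = importance_sampling_ope(trajs, target, behavior)
```

Trajectories whose actions disagree with the target policy get zero weight, while matching ones are up-weighted by the inverse behavior probability, so the estimator recovers the target policy's value without executing it.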
“…Other lines of work have used response tokens as a cue to predict the dynamics of talk and turn-taking [11] or to make inferences about mental and cognitive states [12]. A growing body of work seeks to model feedback behaviour in human-agent interaction, including by means of response token generation [13,14] and attentive listening systems [15,16]. Despite considerable progress, the place of response tokens in speech technology is by no means settled: they tend to be missed by speech recognizers [17,18,19], and dialog managers have a hard time dealing with them [20]. They thus remain a key issue on which progress towards future generations of voice-interactive technologies and conversational user interfaces depends.…”
Section: Related Work
confidence: 99%