International Conference on Multimodal Interfaces and the Workshop on Machine Learning for Multimodal Interaction 2010
DOI: 10.1145/1891903.1891910

Facilitating multiparty dialog with gaze, gesture, and speech

Abstract: We study how synchronized gaze, gesture and speech rendered by an embodied conversational agent can influence the flow of conversations in multiparty settings. We review a computational framework for turn taking that provides the foundation for tracking and communicating intentions to hold, release, or take control of the conversational floor. We then present details of the implementation of the approach in an embodied conversational agent and describe experiments with the system in a shared task setting. Fina…
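To make the floor-control vocabulary concrete, here is a minimal Python sketch of the hold/release/take intentions the abstract mentions. The enum, function, and decision rule are illustrative assumptions, not the authors' implementation, which reasons over much richer multimodal evidence.

```python
# A minimal sketch (not the authors' implementation) of the floor-control
# actions the abstract describes: an agent tracks whether to hold, release,
# or take the conversational floor and renders matching gaze/gesture/speech
# behaviors. All names here are illustrative.
from enum import Enum, auto

class FloorAction(Enum):
    HOLD = auto()     # keep the floor (e.g., avert gaze, continue speaking)
    RELEASE = auto()  # yield the floor (e.g., gaze at addressee, fall silent)
    TAKE = auto()     # claim the floor (e.g., gaze + gesture + start speaking)
    NULL = auto()     # no floor action (listen passively)

def choose_floor_action(agent_has_floor: bool,
                        agent_wants_floor: bool,
                        user_is_speaking: bool) -> FloorAction:
    """Pick a floor-control action from simple binary cues.

    A toy decision rule standing in for the tracked intentions the abstract
    mentions; the real framework infers these from multimodal evidence.
    """
    if agent_has_floor:
        return FloorAction.HOLD if agent_wants_floor else FloorAction.RELEASE
    if agent_wants_floor and not user_is_speaking:
        return FloorAction.TAKE
    return FloorAction.NULL
```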

Cited by 123 publications (77 citation statements) · References 18 publications
“…Early work on developing architectures to manage this problem considers how nonverbal cues used by virtual agents on a screen can affect perception of lifelikeness (Cassell & Thorisson, 1999). More recent work on engagement with virtual agents uses more elaborate turn-taking models and supports multiparty conversation (Bohus & Horvitz, 2010). Research in spoken dialog systems also attempts to control the timing of turn-taking over the single modality of speech (Raux & Eskenazi, 2009).…”
Section: Related Work
confidence: 99%
“…Next, it asks the user for an order (action ask_order), which is followed by the act of listening for the order (expectation order(X)). Once the order has been provided, the robot consults the multi-DOA estimation module if any DOAs were detected during the act of listening (expectation dirs(As)), filtering them for consistency. Depending upon the number of consistent DOAs detected, one of the following situations may be triggered: a) if no consistent DOAs were detected (situation A([])), it accepts the order and asks whether the user wants something else, but it does not face the user; b) if only one consistent DOA was detected (situation A([A])), it accepts the order, faces the user and asks the user whether he/she wants something else; or c) if more than one consistent DOA was detected (situation G(As)), it rejects the order, adds the DOAs to Ps (action push([A,B,...], Ps)), tells the users to speak one at a time, and returns to the initial situation to retake the order while providing Ps as an argument, which results in the robot facing each consistent DOA and taking an order for each one.…”
Section: goto(d)
confidence: 99%
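The branching in the quoted passage can be sketched in Python. The situation and action names (ask_order, A([]), A([A]), G(As), push) come from the quote; the function signature and the action-string representation of the plan are illustrative assumptions, not the cited system's API.

```python
# A rough Python re-rendering of the branching logic in the quoted passage.
# Representing the resulting plan as a list of action strings is an
# illustrative choice; the cited system executes these as robot behaviors.
def order_plan(order: str, doas: list, pending: list) -> list:
    """Return the actions triggered by the number of consistent DOAs."""
    if len(doas) == 0:                     # situation A([])
        # accept and ask for more, without facing the user
        return [f"accept({order})", "ask_anything_else"]
    if len(doas) == 1:                     # situation A([A])
        # accept, face the single detected speaker, ask for more
        return [f"accept({order})", f"face({doas[0]})", "ask_anything_else"]
    # situation G(As): reject, remember the DOAs, retake an order per speaker
    pending.extend(doas)                   # action push([A,B,...], Ps)
    plan = [f"reject({order})", "say('Please speak one at a time.')"]
    for doa in pending:
        plan += [f"face({doa})", "ask_order"]  # retake an order for each DOA
    return plan

# Example: two overlapping speakers detected at 30 and 120 degrees.
print(order_plan("coffee", [30, 120], []))
```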
“…Bohus and Horvitz [5] developed a system capable of differentiating speakers in a turn-based speaking environment. The system was able to determine who was speaking to whom by evaluating hand gestures and other cues.…”
Section: Related Work
confidence: 99%
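As a hedged illustration of the cue-based addressee inference the quote attributes to the system, here is a toy Python scoring sketch; the cue names, weights, and functions are assumptions for exposition, not values or code from the paper.

```python
# A toy sketch of combining multimodal cues to decide who is speaking to
# whom, in the spirit of the quoted description. Weights are illustrative.
def addressee_score(gaze_on_target: float,
                    gesture_toward_target: float,
                    lexical_address_cue: float) -> float:
    """Combine normalized cues (each in [0, 1]) into a single score."""
    weights = (0.5, 0.3, 0.2)  # illustrative weighting, not from the paper
    cues = (gaze_on_target, gesture_toward_target, lexical_address_cue)
    return sum(w * c for w, c in zip(weights, cues))

def infer_addressee(candidates: dict) -> str:
    """Pick the candidate with the highest combined cue score.

    `candidates` maps participant id -> (gaze, gesture, lexical) cue tuple.
    """
    return max(candidates, key=lambda p: addressee_score(*candidates[p]))

# Example with made-up cue values for two participants.
participants = {"Alice": (0.9, 0.2, 0.1), "Bob": (0.3, 0.8, 0.0)}
print(infer_addressee(participants))  # -> "Alice" under these toy cues
```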