Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction 2015
DOI: 10.1145/2696454.2696467
Embodied Collaborative Referring Expression Generation in Situated Human-Robot Interaction

Abstract: To facilitate referential communication between humans and robots and mediate their differences in representing the shared environment, we are exploring embodied collaborative models for referring expression generation (REG). Instead of a single minimum description to describe a target object, episodes of expressions are generated based on human feedback during human-robot interaction. We particularly investigate the role of embodiment such as robot gesture behaviors (i.e., pointing to an object) and human's g…

Cited by 62 publications (37 citation statements) · References 21 publications (17 reference statements)
“…Many of the early works in this space focused on relatively limited datasets, using synthesized images of objects in artificial scenes or limited sets of real-world objects in simplified environments [20,7,15]. Recently, the research focus has shifted to more complex natural image datasets and has expanded to include the Referring Expression Comprehension task [13,19,31] as well as to real-world interactions with robotics [4,3]. One reason this has become feasible is that several large-scale REG datasets have been collected at a scale where deep learning models can be applied.…”
Section: Introduction
confidence: 99%
“…These techniques have been successfully applied in HRI for children with Autism Spectrum Disorders (ASD) [100], [101]. Attention formulation could be achieved either by deictic words or vocal expressions such as "look here, see me, look right, or next one", by pointing gesture, by using line of sight, by using communication cue such as eye gazing [102], [103] or by combination of vocal and gesture commands [104], [105].…”
Section: A Joint Attention Formulation
confidence: 99%
“…Finally, previous work addressed the difficulty of common grounding due to the perceptual difference between humans and machines (Liu, Fang, and Chai 2012;Fang, Doering, and Chai 2015). However, such problems are specific to human-machine dialogues, and instead we focus on a more general difficulty of common grounding due to complex ambiguity and uncertainty introduced by continuous and partially-observable context.…”
Section: Related Work
confidence: 99%