Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction 2007
DOI: 10.1145/1228716.1228727

Using vision, acoustics, and natural language for disambiguation

Abstract: Creating a human-robot interface is a daunting experience. Capabilities and functionalities of the interface are dependent on the robustness of many different sensor and input modalities. For example, object recognition poses problems for state-of-the-art vision systems. Speech recognition in noisy environments remains problematic for acoustic systems. Natural language understanding and dialog are often limited to specific domains and baffled by ambiguous or novel utterances. Plans based on domain-specific tas…

Cited by 29 publications (13 citation statements)
References 26 publications

“…Moreover, for robotic use, high accuracy is required for approaching a person or for interactive movements such as making gestures or maintaining eye contact. Finding people with on-board sensors installed on robots is another actively studied field (Gockley et al., 2007; Fransen et al., 2007). However, this is effective only after the robot has approached close enough to the target person.…”
Section: The Sensor Layer: Measuring Human Position
confidence: 99%
“…However, we would argue that it would be a grave mistake to discard earlier work on symbol grounding. If, for example, the systems of [12,16,3,7,4] could be utilised in one and the same system, we would truly be able to take a step forward as a community. Therefore, non-intrusiveness is an important requirement for our binding system [10].…”
Section: Background and Motivation
confidence: 99%
“…When an object is placed in front of the robot, the visual subarchitecture processes the object as described previously, ultimately creating a visual proxy for it. When the tutor makes an assertion about the object (or relation), we use recency information to bind the communication proxy for the deictic reference to the newest visual proxy. The communication proxy contains binding features for all of the adjectives used in the utterance.…”
Section: An Example of Interactive Learning
confidence: 99%
“…Finding people with on-board sensors installed on robots is another actively studied field [13,14]. However, this is effective only after the robot has approached close enough to the target person.…”
Section: Sensor Layer: Human Position Estimation
confidence: 99%