International Conference on Multimodal Interfaces and the Workshop on Machine Learning for Multimodal Interaction 2010
DOI: 10.1145/1891903.1891912

Focusing computational visual attention in multi-modal human-robot interaction

Abstract: Identifying verbally and non-verbally referred-to objects is an important aspect of human-robot interaction. Most importantly, it is essential to achieve a joint focus of attention and, thus, a natural interaction behavior. In this contribution, we introduce a saliency-based model that reflects how multi-modal referring acts influence the visual search, i.e. the task of finding a specific object in a scene. To this end, we combine positional information obtained from pointing gestures with contextual knowledge about…

Cited by 40 publications (43 citation statements) · References 51 publications · Citing publications span 2011–2024
“…The Wilcoxon rank sum test was used to compare resection volumes, channel counts, and seizure stereotypy calculated using Levenshtein distance, a measure of the differences in channel sequences. 25 Standard error measurements were used throughout.…”
Section: Classification Of Electrodesmentioning
confidence: 99%
“…At the right-hand side of figure 2.2, the attended object is shown. Schauerte and Fink [11] join contextual information from language with a stimulus-driven saliency model to create a top-down saliency map. The attention in this map is directed towards a certain region by means of a pointing gesture.…”
Section: Existing Modelsmentioning
confidence: 99%
“…If the angle is large, then the value is low. In this case, a normal distribution with a zero mean is used to create a saliency map (Schauerte and Fink [11]). Equation 5.17 shows the normal distribution, where σ is the standard deviation.…”
Section: Mathematical Description Of a Pointing Gesturementioning
confidence: 99%
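The quoted description — a zero-mean normal distribution over the angular deviation from the pointing ray, so that small angles yield high saliency and large angles low saliency — can be sketched as follows. This is a minimal illustration, not the cited implementation; the function name, the use of NumPy, and the default σ are assumptions:

```python
import numpy as np

def pointing_saliency(angle_rad, sigma=0.3):
    """Saliency of a scene location given its angular deviation
    (in radians) from the pointing direction: an unnormalized
    zero-mean Gaussian, maximal at angle 0 and decaying as the
    angle grows. sigma is the standard deviation of the Gaussian
    (a hypothetical default, not from the cited work)."""
    angle_rad = np.asarray(angle_rad, dtype=float)
    return np.exp(-(angle_rad ** 2) / (2.0 * sigma ** 2))
```

Applied to a per-pixel map of angular deviations, this produces the pointing-gesture saliency map the citing text describes: saliency 1 along the pointing ray, falling off smoothly with angle.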
“…To extract all salient proto-object regions that attract the attention, we apply a location-based inhibition of return (see [40]) mechanism on the saliency map (see, e.g., [5], [8], [15], [18]). To this end, we use the accumulator to select the most salient proto-object region and inhibit all pixels within the estimated outline by setting their saliency to zero.…”
Section: Systemmentioning
confidence: 99%
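The location-based inhibition-of-return mechanism described in the quote — repeatedly select the most salient proto-object region, then set the saliency of its pixels to zero so attention moves on — can be sketched as below. All names and the boolean-mask representation of proto-object regions are hypothetical, assumed for illustration only:

```python
import numpy as np

def attend_regions(saliency, region_masks, k=3):
    """Return the indices of up to k proto-object regions in the
    order they attract attention. Each region is a boolean mask
    over the saliency map; after a region is selected, its pixels
    are inhibited (saliency set to zero), implementing a
    location-based inhibition of return."""
    s = saliency.copy()
    attended = []
    for _ in range(min(k, len(region_masks))):
        # Score each region by its remaining accumulated saliency.
        scores = [s[mask].sum() for mask in region_masks]
        best = int(np.argmax(scores))
        if scores[best] <= 0.0:
            break  # nothing salient left to attend
        attended.append(best)
        s[region_masks[best]] = 0.0  # inhibit the attended region
    return attended
```

Because an attended region's saliency is zeroed rather than the region being removed from the candidate list, overlapping regions are also partially inhibited, which matches the pixel-level inhibition the quote describes.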