International Conference on Multimodal Interfaces and the Workshop on Machine Learning for Multimodal Interaction 2010
DOI: 10.1145/1891903.1891912

Focusing computational visual attention in multi-modal human-robot interaction

Abstract: Identifying verbally and non-verbally referred-to objects is an important aspect of human-robot interaction. Most importantly, it is essential to achieve a joint focus of attention and, thus, a natural interaction behavior. In this contribution, we introduce a saliency-based model that reflects how multi-modal referring acts influence the visual search, i.e. the task of finding a specific object in a scene. To this end, we combine positional information obtained from pointing gestures with contextual knowledge about…

Cited by 40 publications (43 citation statements) · References 51 publications · Citing publications span 2011–2024
“…The Wilcoxon rank sum test was used to compare resection volumes, channel counts, and seizure stereotypy calculated using Levenshtein distance, a measure of the differences in channel sequences. 25 Standard error measurements were used throughout.…”
Section: Classification Of Electrodesmentioning
confidence: 99%
“…At the right-hand side of figure 2.2, the attended object is shown. Schauerte and Fink [11] join contextual information from language with a stimulus-driven saliency model to create a top-down saliency map. The attention in this map is directed towards a certain region by means of a pointing gesture.…”
Section: Existing Modelsmentioning
confidence: 99%
“…If the angle is large, then the value is low. In this case, a normal distribution with a zero mean is used to create a saliency map (Schauerte and Fink [11]). Equation 5.17 shows the normal distribution, where σ is the standard deviation.…”
Section: Mathematical Description Of a Pointing Gesturementioning
confidence: 99%
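The quoted description — a zero-mean normal distribution over the angular deviation from the pointing ray, so that small angles yield high saliency and large angles low saliency — can be sketched as follows. This is a minimal illustration, not the cited implementation; the function name, the use of NumPy, and the default σ are assumptions:

```python
import numpy as np

def pointing_saliency(angle_rad, sigma=0.3):
    """Saliency of a scene location given its angular deviation
    (in radians) from the pointing direction: an unnormalized
    zero-mean Gaussian, maximal at angle 0 and decaying as the
    angle grows. sigma is the standard deviation of the Gaussian
    (a hypothetical default, not from the cited work)."""
    angle_rad = np.asarray(angle_rad, dtype=float)
    return np.exp(-(angle_rad ** 2) / (2.0 * sigma ** 2))
```

Applied to a per-pixel map of angular deviations, this produces the pointing-gesture saliency map the citing text describes: saliency 1 along the pointing ray, falling off smoothly with angle.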
“…To extract all salient proto-object regions that attract the attention, we apply a location-based inhibition of return (see [40]) mechanism on the saliency map (see, e.g., [5], [8], [15], [18]). To this end, we use the accumulator to select the most salient proto-object region and inhibit all pixels within the estimated outline by setting their saliency to zero.…”
Section: Systemmentioning
confidence: 99%
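The location-based inhibition-of-return mechanism described in the quote — repeatedly select the most salient proto-object region, then set the saliency of its pixels to zero so attention moves on — can be sketched as below. All names and the boolean-mask representation of proto-object regions are hypothetical, assumed for illustration only:

```python
import numpy as np

def attend_regions(saliency, region_masks, k=3):
    """Return the indices of up to k proto-object regions in the
    order they attract attention. Each region is a boolean mask
    over the saliency map; after a region is selected, its pixels
    are inhibited (saliency set to zero), implementing a
    location-based inhibition of return."""
    s = saliency.copy()
    attended = []
    for _ in range(min(k, len(region_masks))):
        # Score each region by its remaining accumulated saliency.
        scores = [s[mask].sum() for mask in region_masks]
        best = int(np.argmax(scores))
        if scores[best] <= 0.0:
            break  # nothing salient left to attend
        attended.append(best)
        s[region_masks[best]] = 0.0  # inhibit the attended region
    return attended
```

Because an attended region's saliency is zeroed rather than the region being removed from the candidate list, overlapping regions are also partially inhibited, which matches the pixel-level inhibition the quote describes.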