Efficient visual coding and the predictability of eye movements on natural movies

Vig, Eleonora; Dörr, Michael; Barth, Erhardt

doi:10.1163/156856809789476065

Cited by 35 publications

(41 citation statements)

References 26 publications


“…Thus, invariant K with an ROC score of 0.68 is best, followed by S (AUC of 0.66), whereas the worst performing is H with an AUC of 0.64. Similar results that showed this ranking were published in [10] on a substantially different problem: there we were predicting gaze behaviour of new viewers on videos that have already been "seen" (i.e. learned on) by the classifier, as opposed to predicting eye movements on new videos.…”

Section: Quantitative Analysis (supporting)

confidence: 67%
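The AUC figures quoted above summarize how well a saliency measure separates fixated from non-fixated locations. As a minimal sketch only, assuming that each location has already been reduced to a single saliency score, the AUC can be computed as the probability that a randomly chosen fixated score exceeds a randomly chosen control score. The function name `auc_from_scores` and the sample values are hypothetical and are not taken from the cited papers.

```python
def auc_from_scores(fixated, control):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (fixated, control) pairs in which the fixated
    score is higher; ties count as half a win. 0.5 is chance level.
    """
    if not fixated or not control:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for f in fixated:
        for c in control:
            if f > c:
                wins += 1.0
            elif f == c:
                wins += 0.5
    return wins / (len(fixated) * len(control))


# Hypothetical saliency scores (illustration only, not data from the papers).
fixated_scores = [0.9, 0.7, 0.4]
control_scores = [0.8, 0.3, 0.2]
auc = auc_from_scores(fixated_scores, control_scores)
print(round(auc, 3))
```

The pairwise form is quadratic in the number of samples; it is chosen here for readability, not efficiency.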

“…In this paper, we propose a rather simplistic model of bottom-up saliency for dynamic scenes with the aim to keep the number of assumptions (and, implicitly, the number of free parameters) to a minimum. This model is also related to the neurobiological principle of efficient coding [10]. To test our model, we evaluate how well it predicts human eye movements on naturalistic videos both in absolute terms and in comparison with more complex, state-of-the-art saliency models.…”

Section: Introduction (mentioning)

confidence: 99%


Intrinsic Dimensionality Predicts the Saliency of Natural Dynamic Scenes

Vig, Dörr, Martinetz et al. 2012

IEEE Trans. Pattern Anal. Mach. Intell.


Abstract: Since visual attention-based computer vision applications have gained popularity, ever more complex, biologically inspired models seem to be needed to predict salient locations (or interest points) in naturalistic scenes. In this paper, we explore how far one can go in predicting eye movements by using only basic signal processing, such as image representations derived from efficient coding principles, and machine learning. To this end, we gradually increase the complexity of a model from simple single-scale saliency maps computed on grayscale videos to spatio-temporal multiscale and multispectral representations. Using a large collection of eye movements on high-resolution videos, supervised learning techniques fine-tune the free parameters whose addition is inevitable with increasing complexity. The proposed model, although very simple, demonstrates significant improvement in predicting salient locations in naturalistic videos over four selected baseline models and two distinct data labelling scenarios.

Index Terms: Computational models of vision, video analysis, computer vision, spatio-temporal saliency, eye movement prediction, intrinsic dimension, visual attention, interest point detection.


“…fixated movie patches with a set of randomly collected movie patches. This analysis was done analogously to the analysis carried out for scene-fixation studies, in which a gaze prediction for real-world scenes is sought (see, e.g., Carmi & Itti, 2006; Parkhurst & Niebur, 2003; Reinagel & Zador, 1999; Tatler et al, 2006; Vig, Dorr, & Barth, 2009). They typically used (static) images of real-world scenes while observers performed free viewing.…”

Section: Justus-Liebig-Universität Giessen, Germany (mentioning)

confidence: 99%

Visual orienting in dynamic broadband (1/f) noise sequences

Rasche, Gegenfurtner 2010

Attention, Perception, & Psychophysics


“…We were able to show that on novel test stimuli, subjects who had received such information performed better than subjects who had not seen the expert's eye movements during training, and that the gaze visualization technique presented here facilitated learning better than a simple gaze display (yellow gaze marker). In principle, any visualization technique that reduces the relative visibility of those regions not attended by the expert might have a similar effect; our choice for this particular technique was motivated by our work on eye movement prediction [Dorr et al 2008; Vig et al 2009], which shows that spectral energy is a good predictor for eye movements. Ultimately, we intend to use similar techniques in a gaze-contingent fashion in order to guide the gaze of an observer ].…”

Section: Results (mentioning)

confidence: 99%

Space-variant spatio-temporal filtering of video for gaze visualization and perceptual learning

Dörr, Jarodzka, Barth 2010

Proceedings of the 2010 Symposium on Eye-Tracking Research & Applications - ETRA '10

Self Cite


We introduce an algorithm for space-variant filtering of video based on a spatio-temporal Laplacian pyramid and use this algorithm to render videos in order to visualize pre-recorded eye movements. Spatio-temporal contrast and colour saturation are reduced as a function of distance to the nearest gaze point of regard, i.e. nonfixated, distracting regions are filtered out, whereas fixated image regions remain unchanged. Results of an experiment in which the eye movements of an expert on instructional videos are visualized with this algorithm, so that the gaze of novices is guided to relevant image locations, show that this visualization technique facilitates the novices' perceptual learning.
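The idea of reducing contrast as a function of distance to the nearest gaze point can be illustrated with a much simpler, single-frame sketch. This is not the spatio-temporal Laplacian pyramid algorithm described in the abstract: it assumes a grayscale frame stored as nested lists, a Gaussian falloff with a hypothetical width parameter `sigma`, and contrast reduction by blending each pixel towards a fixed mean level. The function names `attenuation_map` and `attenuate_contrast` are invented for illustration.

```python
import math


def attenuation_map(width, height, gaze_points, sigma):
    """Per-pixel weights in (0, 1]: 1.0 at a gaze point, decaying with the
    distance to the nearest gaze point (Gaussian falloff is an assumption)."""
    weights = []
    for y in range(height):
        row = []
        for x in range(width):
            # Squared distance to the nearest gaze point.
            d2 = min((x - gx) ** 2 + (y - gy) ** 2 for gx, gy in gaze_points)
            row.append(math.exp(-d2 / (2.0 * sigma ** 2)))
        weights.append(row)
    return weights


def attenuate_contrast(frame, weights, mean_level):
    """Blend each pixel towards mean_level; weight 1.0 leaves it unchanged."""
    return [
        [mean_level + w * (p - mean_level) for p, w in zip(frame_row, weight_row)]
        for frame_row, weight_row in zip(frame, weights)
    ]


# Hypothetical 5x5 frame with alternating dark and bright pixels.
frame = [[0.0 if (x + y) % 2 == 0 else 1.0 for x in range(5)] for y in range(5)]
weights = attenuation_map(5, 5, gaze_points=[(2, 2)], sigma=1.0)
filtered = attenuate_contrast(frame, weights, mean_level=0.5)
```

In this sketch the pixel at the gaze point keeps its original value, while pixels far from it move towards the mean level, which lowers local contrast away from the fixated region.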
