Thomas S. A. Wallis scite author profile

Learning the properties of an image associated with human gaze placement is important both for understanding how biological systems explore the environment and for computer vision applications. There is a large literature on quantitative eye movement models that seeks to predict fixations from images (sometimes termed "saliency" prediction). A major problem known to the field is that existing model comparison metrics give inconsistent results, causing confusion. We argue that the primary reason for these inconsistencies is because different metrics and models use different definitions of what a "saliency map" entails. For example, some metrics expect a model to account for image-independent central fixation bias whereas others will penalize a model that does. Here we bring saliency evaluation into the domain of information by framing fixation prediction models probabilistically and calculating information gain. We jointly optimize the scale, the center bias, and spatial blurring of all models within this framework. Evaluating existing metrics on these rephrased models produces almost perfect agreement in model rankings across the metrics. Model performance is separated from center bias and spatial blurring, avoiding the confounding of these factors in model comparison. We additionally provide a method to show where and how models fail to capture information in the fixations on the pixel level. These methods are readily extended to spatiotemporal models of fixation scanpaths, and we provide a software package to facilitate their use.visual attention | eye movements | probabilistic modeling | likelihood | point processes H umans move their eyes about three times/s when exploring the environment, fixating areas of interest with the highresolution fovea. How do we determine where to fixate to learn about the scene in front of us? This question has been studied extensively from the perspective of "bottom-up" attentional guidance (1), often in a "free-viewing" task in which a human observer explores a static image for some seconds while his or her eye positions are recorded (Fig. 1A). Eye movement prediction is also applied in domains from advertising to efficient object recognition. In computer vision the problem of predicting fixations from images is often referred to as "saliency prediction," while to others "saliency" refers explicitly to some set of low-level image features (such as edges or contrast). In this paper we are concerned with predicting fixations from images, taking no position on whether the features that guide eye movements are "low" or "high" level.The field of eye movement prediction is quite mature: Beginning with the influential model of Itti et al. (1), there are now over 50 quantitative fixation prediction models, including around 10 models that seek to incorporate "top-down" effects (see refs. 2-4 for recent reviews and analyses of this extensive literature). Many of these models are designed to be biologically plausible whereas others aim purely at prediction (e.g., ref. 5). Progress is meas...

show abstract

Using fuzzy signal detection theory to determine why experienced and trained drivers respond faster than novices in a hazard perception test

Wallis

Horswill

2007

Accident Analysis & Prevention

142

View full text Add to dashboard Cite

Saliency Benchmarking Made Easy: Separating Models, Maps and Metrics

Kümmerer

Wallis

Bethge

2018

View full text Add to dashboard Cite

Dozens of new models on fixation prediction are published every year and compared on open benchmarks such as MIT300 and LSUN. However, progress in the field can be difficult to judge because models are compared using a variety of inconsistent metrics. Here we show that no single saliency map can perform well under all metrics. Instead, we propose a principled approach to solve the benchmarking problem by separating the notions of saliency models, maps and metrics. Inspired by Bayesian decision theory, we define a saliency model to be a probabilistic model of fixation density prediction and a saliency map to be a metric-specific prediction derived from the model density which maximizes the expected performance on that metric given the model density. We derive these optimal saliency maps for the most commonly used saliency metrics (AUC, sAUC, NSS, CC, SIM, KL-Div) and show that they can be computed analytically or approximated with high precision.We show that this leads to consistent rankings in all metrics and avoids the penalties of using one saliency map for all metrics. Our method allows researchers to have their model compete on many different metrics with state-of-the-art in those metrics: "good" models will perform well in all metrics.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Thomas S. A. Wallis

Understanding Low- and High-Level Contributions to Fixation Prediction

Information-theoretic model comparison unifies saliency metrics

Using fuzzy signal detection theory to determine why experienced and trained drivers respond faster than novices in a hazard perception test

Saliency Benchmarking Made Easy: Separating Models, Maps and Metrics

Contact Info

Product

Resources

About