Qi Zhao scite author profile

Many previous models for predicting where people look in natural scenes have focused on pixel-level image attributes. To bridge the semantic gap between the predictive power of computational saliency models and human behavior, we propose a new saliency architecture that incorporates information at three layers: pixel-level image attributes, object-level attributes, and semantic-level attributes. Object- and semantic-level information is frequently ignored, or only a few sample object categories are discussed, an approach for which scaling to a large number of object categories is neither feasible nor neurally plausible. To address this problem, this work constructs a principled vocabulary of basic attributes to describe object- and semantic-level information, and is thus not restricted to a limited number of object categories. We build a new dataset of 700 images with eye-tracking data from 15 viewers and annotations of 5,551 segmented objects with fine contours and 12 semantic attributes (publicly available with the paper). Experimental results demonstrate the importance of object- and semantic-level information in predicting visual attention.

TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-captured Scenarios

Zhu et al. 2021

Atypical Visual Saliency in Autism Spectrum Disorder Quantified through Model-Based Eye Tracking

Wang, Jiang, Duchesne et al. 2015. Neuron.

Summary: The social difficulties that are a hallmark of autism spectrum disorder (ASD) are thought to arise, at least in part, from atypical attention towards stimuli and their features. To investigate this hypothesis comprehensively, we characterized 700 complex natural scene images with a novel 3-layered saliency model that incorporated pixel-level (e.g., contrast), object-level (e.g., shape), and semantic-level attributes (e.g., faces) on 5,551 annotated objects. Compared to matched controls, people with ASD had a stronger image center bias regardless of object distribution, reduced saliency for faces and for locations indicated by social gaze, yet a general increase in pixel-level saliency at the expense of semantic-level saliency. These results were further corroborated by direct analysis of fixation characteristics and investigation of feature interactions. Our results for the first time quantify atypical visual attention in ASD across multiple levels and categories of objects.

Qi Zhao

SALICON: Saliency in Context

SALICON: Reducing the Semantic Gap in Saliency Prediction by Adapting Deep Neural Networks

Predicting human gaze beyond pixels

TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-captured Scenarios

Atypical Visual Saliency in Autism Spectrum Disorder Quantified through Model-Based Eye Tracking