The perception of objects in our visual world is influenced by not only their low-level visual features such as shape and color, but also their high-level features such as meaning and semantic relations among them. While it has been shown that low-level features in real-world scenes guide eye movements during scene inspection and search, the influence of semantic similarity among scene objects on eye movements in such situations has not been investigated. Here we study guidance of eye movements by semantic similarity among objects during real-world scene inspection and search. By selecting scenes from the LabelMe object-annotated image database and applying Latent Semantic Analysis (LSA) to the object labels, we generated semantic saliency maps of real-world scenes based on the semantic similarity of scene objects to the currently fixated object or the search target. An ROC analysis of these maps as predictors of subjects’ gaze transitions between objects during scene inspection revealed a preference for transitions to objects that were semantically similar to the currently inspected one. Furthermore, during the course of a scene search, subjects’ eye movements were progressively guided toward objects that were semantically similar to the search target. These findings demonstrate substantial semantic guidance of eye movements in real-world scenes and show its importance for understanding real-world attentional control.
Recently, there has been great interest among vision researchers in developing computational models that predict the distribution of saccadic endpoints in naturalistic scenes. In many of these studies, subjects are instructed to view scenes without any particular task in mind so that stimulus-driven (bottom-up) processes guide visual attention. However, whenever there is a search task, goal-driven (top-down) processes tend to dominate guidance, as indicated by attention being systematically biased toward image features that resemble those of the search target. In the present study, we devise a top-down model of visual attention during search in complex scenes based on similarity between the target and regions of the search scene. Similarity is defined for several feature dimensions such as orientation or spatial frequency using a histogram-matching technique. The amount of attentional guidance across visual feature dimensions is predicted by a previously introduced informativeness measure. We use eye-movement data gathered from participants’ search of a set of naturalistic scenes to evaluate the model. The model is found to predict the distribution of saccadic endpoints in search displays nearly as accurately as do other observers’ eye-movement data in the same displays.
Purpose Google Glass provides a platform that can be easily extended to include a vision enhancement tool. We have implemented an augmented vision system on Glass, which overlays enhanced edge information over the wearer’s real world view, to provide contrast-improved central vision to the Glass wearers. The enhanced central vision can be naturally integrated with scanning. Methods Goggle Glass’s camera lens distortions were corrected by using an image warping. Since the camera and virtual display are horizontally separated by 16mm, and the camera aiming and virtual display projection angle are off by 10°, the warped camera image had to go through a series of 3D transformations to minimize parallax errors before the final projection to the Glass’ see-through virtual display. All image processes were implemented to achieve near real-time performance. The impacts of the contrast enhancements were measured for three normal vision subjects, with and without a diffuser film to simulate vision loss. Results For all three subjects, significantly improved contrast sensitivity was achieved when the subjects used the edge enhancements with a diffuser film. The performance boost is limited by the Glass camera’s performance. The authors assume this accounts for why performance improvements were observed only with the diffuser filter condition (simulating low vision). Conclusions Improvements were measured with simulated visual impairments. With the benefit of see-through augmented reality edge enhancement, natural visual scanning process is possible, and suggests that the device may provide better visual function in a cosmetically and ergonomically attractive format for patients with macular degeneration.
Watching 3D content using a stereoscopic display may cause various discomforting symptoms, including eye strain, blurred vision, double vision, and motion sickness. Numerous studies have reported motion-sickness-like symptoms during stereoscopic viewing, but no causal linkage between specific aspects of the presentation and the induced discomfort has been explicitly proposed. Here, we describe several causes, in which stereoscopic capture, display, and viewing differ from natural viewing resulting in static and, importantly, dynamic distortions that conflict with the expected stability and rigidity of the real world. This analysis provides a basis for suggested changes to display systems that may alleviate the symptoms, and suggestions for future studies to determine the relative contribution of the various effects to the unpleasant symptoms.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.