Figure 1: Retinal implant ('bionic eye') for restoring vision to people with visual impairment. A) Light captured by a camera is transformed into electrical pulses delivered through a microelectrode array to stimulate the retina (adapted with permission from [39]). B) To create meaningful artificial vision, we explored deep learning-based scene simplification as a preprocessing strategy for retinal implants (reproduced from doi:10.6084/m9.figshare.13652927 under CC-BY 4.0). As a proof of concept, we used a neurobiologically inspired computational model to generate realistic predictions of simulated prosthetic vision (SPV), and asked sighted subjects (i.e., virtual patients) to identify people and cars in a novel SPV dataset of natural outdoor scenes. In the future, this setup may be used as input to a real retinal implant.
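The caption's camera-to-electrode pipeline can be sketched in code. The snippet below is a minimal, illustrative simulated-prosthetic-vision (SPV) renderer, not the neurobiologically inspired model from the paper: it averages image brightness over a hypothetical electrode grid and renders each electrode as a Gaussian "phosphene". The grid size and blob width are assumptions.

```python
import numpy as np

def simulate_phosphenes(image, grid=(20, 20), sigma=1.5):
    """Render a crude simulated-prosthetic-vision view of `image`.

    `image` is a grayscale H x W array with values in [0, 1].  Brightness is
    averaged over each electrode's patch, then each electrode is drawn as a
    Gaussian phosphene.  All parameters are illustrative assumptions.
    """
    h, w = image.shape
    gy, gx = grid
    # Average brightness within each electrode's receptive patch.
    levels = image[:h - h % gy, :w - w % gx] \
        .reshape(gy, h // gy, gx, w // gx).mean(axis=(1, 3))
    # Place one Gaussian blob per electrode on the output canvas.
    out = np.zeros((h, w))
    ys = np.linspace(h / (2 * gy), h - h / (2 * gy), gy)
    xs = np.linspace(w / (2 * gx), w - w / (2 * gx), gx)
    yy, xx = np.mgrid[0:h, 0:w]
    for i, cy in enumerate(ys):
        for j, cx in enumerate(xs):
            out += levels[i, j] * np.exp(
                -((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * sigma ** 2))
    return np.clip(out, 0, 1)
```

In the paper's setup, a scene-simplification network (e.g., detecting people and cars) would run before a renderer like this, so that only task-relevant structure is encoded into phosphenes.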
Gaze direction is an evolutionarily important mechanism in daily social interactions. It reflects a person's internal cognitive state and spatial locus of interest, and it predicts future actions. Studies using foveally presented static head images and simple synthetic tasks have found that gaze orients attention and facilitates target detection at the cued location in a sustained manner. Little is known about how people's natural gaze behavior, including eye, head, and body movements, jointly orients covert attention and microsaccades and facilitates performance in more ecological, dynamic scenes. Participants completed a target-person detection task with videos of real scenes. The videos showed people looking toward (valid cue) or away from (invalid cue) a target location. We digitally manipulated the gaze-directing individuals in the videos to create three conditions: whole-intact (head and body movements), floating heads (only head movements), and headless bodies (only body movements). We assessed the impact of each condition on participants' behavioral performance and microsaccades during the task. We show that, in isolation, an individual's head or body orienting toward the target direction led to a detection facilitation that was transient in time (200 ms). In contrast, only the whole-intact condition led to sustained facilitation (500 ms). Furthermore, observers executed microsaccades more frequently toward the cued direction on valid trials, but this bias was sustained in time only with the joint presence of head and body. Together, the results differ from previous findings with foveally presented static heads: in more real-world scenarios and tasks, sustained attention requires the presence of the whole, intact body of an individual dynamically directing gaze.
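The two behavioral measures in this abstract, the cue-validity benefit and the microsaccade direction bias, could be computed along the following lines. This is a generic sketch, not the paper's analysis code; function names and the angular window are illustrative assumptions.

```python
import numpy as np

def validity_effect(rt_valid, rt_invalid):
    """Cueing benefit: mean reaction-time difference (invalid - valid),
    in the units of the inputs.  Positive values indicate facilitation."""
    return np.mean(rt_invalid) - np.mean(rt_valid)

def microsaccade_bias(angles, cue_angle, window=np.pi / 2):
    """Fraction of microsaccades whose direction falls within +/- window/2
    of the cued direction.  Angles are in radians."""
    # Wrap angular differences into [-pi, pi] before thresholding.
    d = np.angle(np.exp(1j * (np.asarray(angles) - cue_angle)))
    return np.mean(np.abs(d) <= window / 2)
```

A sustained bias, as reported for the whole-intact condition, would correspond to `microsaccade_bias` staying above chance when computed in successive post-cue time windows.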
Static gaze cues presented in central vision result in shifts of observers' covert attention and eye movements and in perceptual benefits for detecting simple targets. Less is known about how dynamic gazer behaviors involving head and body motion influence search eye movements and performance in more ecological perceptual tasks in real-world scenes. Participants searched for a target person (yes/no task, 50% target presence) while watching videos of one to three gazers looking at a designated person (50% valid gaze cue, i.e., looking at the target). We digitally manipulated the videos to create three gazer conditions: floating heads (only head movements), headless bodies (only body movements), and the intact baseline. We show that valid dynamic gaze cues guided participants' eye movements (up to three fixations) closer to the target, shortened the time to fixate within 2° of the target, reduced fixations to the gazers, and improved target detection. Gaze-cue guidance was smallest when the gazer's head was removed from the videos. Separate gaze-estimation judgments showed that the reduced eye movement guidance from body-only cueing is related to observers' difficulty extracting gaze information without the presence of the head. Together, the study extends previous work by evaluating the impact of ecologically relevant dynamic gazer behaviors on search with realistic, cluttered videos.
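The "time to fixate within 2° of the target" measure could be extracted from a fixation sequence roughly as below. This is a hedged sketch of the metric, not the paper's code; argument names and the degree-based coordinate convention are assumptions.

```python
import numpy as np

def time_to_fixate_target(fix_x, fix_y, fix_onsets, target_xy, radius_deg=2.0):
    """Onset time of the first fixation landing within `radius_deg` of the
    target (NaN if no fixation does).  Coordinates in degrees of visual
    angle; `fix_onsets` in the experimenter's time units (e.g., ms)."""
    d = np.hypot(np.asarray(fix_x) - target_xy[0],
                 np.asarray(fix_y) - target_xy[1])
    hits = np.flatnonzero(d <= radius_deg)
    return fix_onsets[hits[0]] if hits.size else float('nan')
```

Comparing this latency between valid and invalid cue trials, and across the three gazer conditions, quantifies how much the dynamic gaze cue speeds up search.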
Attending to other people's gaze is evolutionarily important for making inferences about intentions and actions. Gaze influences covert attention and triggers eye movements. However, we know little about how the brain controls the fine-grained dynamics of eye movements during gaze following. Observers followed people's gaze shifts in videos during search, and we related the observers' eye movement dynamics to the time course of gazer head movements extracted by a deep neural network. We show that observers' brains use information in the visual periphery to execute predictive saccades that anticipate the information in the gazer's head direction by 190-350 ms. The brain simultaneously monitors moment-to-moment changes in the gazer's head velocity to dynamically alter eye movements and re-fixate the gazer (reverse saccades) when the head accelerates before the initiation of the first forward gaze-following saccade. Using saccade-contingent manipulations of the videos, we experimentally show that reverse saccades are planned concurrently with the first forward gaze-following saccade and serve a functional role in reducing subsequent errors in fixating the gaze goal. Together, our findings characterize the inferential and functional nature of the fine-grained eye movement dynamics of social attention.
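An anticipation interval like the 190-350 ms reported here is typically estimated by lagged correlation between the gazer's head-direction trace and the observer's gaze trace. The sketch below illustrates that generic analysis; it is not the paper's pipeline, and the sampling rate, lag range, and function name are assumptions.

```python
import numpy as np

def anticipation_lag(head_dir, eye_dir, dt=0.01, max_lag_s=0.6):
    """Lag (seconds) at which the observer's eye-direction trace best
    matches the gazer's head-direction trace.  Negative values mean the
    eyes LEAD (anticipate) the head signal.  Both inputs are equal-length
    1-D arrays sampled every `dt` seconds."""
    h = np.asarray(head_dir) - np.mean(head_dir)
    e = np.asarray(eye_dir) - np.mean(eye_dir)
    max_lag = int(max_lag_s / dt)
    lags = np.arange(-max_lag, max_lag + 1)
    # Pearson correlation of h[t] with e[t + k] for each candidate lag k.
    corr = [np.corrcoef(h[max(0, -k):len(h) - max(0, k)],
                        e[max(0, k):len(e) - max(0, -k)])[0, 1]
            for k in lags]
    return lags[int(np.argmax(corr))] * dt
```

On real data one would compute this per trial and test whether the distribution of lags is reliably negative, consistent with predictive rather than reactive saccades.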
Face processing is fast and efficient owing to its evolutionary and social importance. Most people direct their first eye movement to a featureless point just below the eyes that maximizes accuracy in recognizing a person's identity and gender. Yet the exact properties or features of the face that guide the first eye movements and reduce fixational variability are unknown. Here, we manipulated the presence of facial features and their spatial configuration to investigate their effect on the location and variability of the first and second fixations to peripherally presented faces. Our results showed that observers can use the face outline, individual facial features, and the features' spatial configuration to guide their first eye movements to the preferred point of fixation. The eyes play a preferential role in guiding the first eye movements and reducing fixation variability: eliminating the eyes or altering their position had the greatest influence on the location and variability of fixations and produced the largest detriment to face-identification performance. The other internal features (nose and mouth) also contribute to reducing fixation variability. A subsequent experiment measuring the detection of single features showed that the eyes have the highest detectability (relative to the other features) in the visual periphery, providing a strong sensory signal to guide the oculomotor system. Together, the results suggest a flexible multiple-cue strategy that may be a robust solution to the varying eccentricities of the real world, which limit the ability to resolve individual feature properties, and they underscore the preferential role of the eyes.
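Peripheral "detectability" of a single feature, as in the follow-up experiment, is conventionally summarized with the signal-detection index d'. The snippet below is a standard textbook computation offered for illustration; the paper does not specify this exact formula, so treat it as an assumption.

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate):
    """Signal-detection sensitivity: d' = z(hit rate) - z(false-alarm rate).
    Rates must lie strictly between 0 and 1 (apply a correction otherwise)."""
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)
```

Comparing d' for the eyes against the nose and mouth at matched eccentricities quantifies the claim that the eyes provide the strongest peripheral sensory signal.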