Millions of people are blind worldwide. Sensory substitution (SS) devices (e.g., vOICe) can assist the blind by encoding a video stream into a sound pattern, recruiting visual brain areas for auditory analysis via crossmodal interactions and plasticity. SS devices often require extensive training to attain limited functionality. In contrast to conventional attention-intensive SS training that starts with visual primitives (e.g., geometrical shapes), we argue that sensory substitution can be engaged efficiently by using stimuli (such as textures) associated with intrinsic crossmodal mappings. Crossmodal mappings link images with sounds and tactile patterns. We show that naive sighted participants can match intuitive SS sounds to the correct images just as well as intensively trained participants can. This result indicates that existing crossmodal interactions and amodal sensory cortical processing may be as important in the interpretation of patterns by SS as crossmodal plasticity (e.g., the strengthening of existing connections or the formation of new ones), especially at the earlier stages of SS usage. An SS training procedure based on crossmodal mappings could both considerably improve participant performance and shorten training times, thereby enabling SS devices to significantly expand the capabilities of blind users.
To restore vision for the blind, several prosthetic approaches have been explored that convey raw images to the brain. So far, these schemes all suffer from a lack of bandwidth. An alternative approach would restore vision at the cognitive level, bypassing the need to convey raw sensory data. A wearable computer captures video and other data, extracts important scene knowledge, and conveys it to the user in compact form. Here, we implement an intuitive user interface for such a device using augmented reality: each object in the environment has a voice and communicates with the user on command. With minimal training, this system supports many aspects of visual cognition: obstacle avoidance, scene understanding, formation and recall of spatial memories, and navigation. Blind subjects can traverse an unfamiliar multi-story building on their first attempt. To spur further development in this domain, we developed an open-source environment for standardized benchmarking of visual assistive devices.
The brain constructs a representation of temporal properties of events, such as duration and frequency, but the underlying neural mechanisms are under debate. One open question is whether these mechanisms are unisensory or multisensory. Duration perception studies provide some evidence for a dissociation between auditory and visual timing mechanisms; however, we found active crossmodal interaction between audition and vision for rate perception, even when vision and audition were never stimulated together. After exposure to 5 Hz adaptors, people perceived subsequent test stimuli centered around 4 Hz to be slower, and the reverse after exposure to 3 Hz adaptors. This aftereffect occurred even when the adaptor and test were presented in different modalities and never together. When the discrepancy in rate between adaptor and test increased, the aftereffect was attenuated, indicating that the brain uses narrowly tuned channels to process rate information. Our results indicate that human timing mechanisms for rate perception are not entirely segregated between modalities and have substantial implications for models of how the brain encodes temporal features. We propose a model of multisensory channels for rate perception, and consider the broader implications of such a model for how the brain encodes timing.
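The channel account in the abstract above can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the model proposed by the authors: the Gaussian tuning curves, the tuning width `SIGMA_HZ`, the adaptation strength `ADAPT_STRENGTH`, the channel spacing, and the centroid read-out are all hypothetical choices made for illustration. It shows how adapting narrowly tuned rate channels could shift the perceived rate of a test stimulus away from the adaptor, and why that shift shrinks when adaptor and test are far apart.

```python
import math

# Hypothetical bank of rate-tuned channels with preferred rates from 0 to 8 Hz.
PREFERRED_HZ = [i * 0.05 for i in range(161)]
SIGMA_HZ = 0.75        # assumed tuning width of each channel
ADAPT_STRENGTH = 0.5   # assumed maximum gain loss after adaptation


def tuning(preferred_hz, stimulus_hz):
    """Gaussian response of one channel to a stimulus rate."""
    return math.exp(-((preferred_hz - stimulus_hz) ** 2) / (2.0 * SIGMA_HZ ** 2))


def perceived_rate(test_hz, adaptor_hz=None):
    """Read out perceived rate as the response-weighted mean preferred rate.

    If an adaptor is given, each channel's gain is reduced in proportion
    to how strongly that channel responded to the adaptor.
    """
    weighted_sum = 0.0
    total = 0.0
    for preferred in PREFERRED_HZ:
        gain = 1.0
        if adaptor_hz is not None:
            gain -= ADAPT_STRENGTH * tuning(preferred, adaptor_hz)
        response = gain * tuning(preferred, test_hz)
        weighted_sum += response * preferred
        total += response
    return weighted_sum / total


def aftereffect(test_hz, adaptor_hz):
    """Shift in perceived rate caused by adaptation (negative means slower)."""
    return perceived_rate(test_hz, adaptor_hz) - perceived_rate(test_hz)


if __name__ == "__main__":
    for adaptor in (3.0, 5.0, 7.5):
        print(f"adaptor {adaptor} Hz, test 4 Hz: shift {aftereffect(4.0, adaptor):+.3f} Hz")
```

In this sketch a 5 Hz adaptor lowers the read-out for a 4 Hz test, a 3 Hz adaptor raises it, and a distant 7.5 Hz adaptor produces a much smaller shift because few channels respond to both stimuli. Nothing in the code distinguishes auditory from visual input; treating the channels as shared across modalities is the assumption that would let the aftereffect transfer between senses.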
Neuroscience investigations are most often focused on the prediction of future perception or decisions based on prior brain states or stimulus presentations. However, the brain can also process information retroactively, such that later stimuli impact conscious percepts of stimuli that have already occurred (called “postdiction”). Postdictive effects have thus far been mostly unimodal (such as apparent motion), and the models for postdiction have accordingly been limited to early sensory regions of one modality. We have discovered two related multimodal illusions in which audition instigates postdictive changes in visual perception. In the first illusion (called the “Illusory Audiovisual Rabbit”), the location of an illusory flash is influenced by an auditory beep-flash pair that follows the perceived illusory flash. In the second illusion (called the “Invisible Audiovisual Rabbit”), a beep-flash pair following a real flash suppresses the perception of the earlier flash. We show experimentally that both effects are significantly influenced by postdiction. The audiovisual rabbit illusions indicate that postdiction can bridge the senses, uncovering a relatively neglected yet critical type of neural processing underlying perceptual awareness. Furthermore, these two new illusions broaden the Double Flash Illusion, in which a single real flash is doubled by two sounds. Whereas the double flash indicated that audition can create an illusory flash, these rabbit illusions expand audition’s influence on vision to the suppression of a real flash and the relocation of an illusory flash. These new additions to auditory-visual interactions indicate a spatio-temporally fine-tuned coupling of the senses to generate perception.
A subset of sensory substitution (SS) devices translate images into sounds in real time using a portable computer, camera, and headphones. Perceptual constancy is the key to understanding both functional and phenomenological aspects of perception with SS. In particular, constancies enable object externalization, which is critical to the performance of daily tasks such as obstacle avoidance and locating dropped objects. To improve daily task performance by the blind and to determine whether constancies can be learned with SS, we trained blind (N = 4) and sighted (N = 10) individuals on length and orientation constancy tasks for 8 days at about 1 h per day with an auditory SS device. We found that both blind and sighted participants improved significantly on the constancy tasks and attained above-chance constancy performance. Furthermore, dynamic interactions with stimuli were critical to constancy learning with the SS device. In particular, improved task learning significantly correlated with the number of spontaneous left-right head-tilting movements while learning length constancy. The improvement from previous head-tilting trials even transferred to a no-head-tilt condition. Therefore, not only can SS learning be improved by encouraging head movement while learning, but head movement may also play an important role in learning constancies in the sighted. In addition, the learning of constancies by the blind and sighted with SS provides evidence that SS may be able to restore vision-like functionality to the blind in daily tasks.