The development of computer-vision-based systems that help visually impaired people perceive their environment, orient themselves, and navigate has been the subject of many works in recent years. Significant resources have been devoted to the development of sensory substitution devices (SSDs) and electronic travel aids for the rehabilitation of the visually impaired. The Sound of Vision (SoV) project took a comprehensive approach to developing such an SSD, tackling all the challenging aspects that have so far hindered the large-scale adoption of such systems by their intended audience: wearability, real-time operation, pervasiveness, usability, and cost. This article presents the artificial-vision component of the SoV SSD, which performs scene reconstruction and segmentation in outdoor environments. In contrast with the indoor use case, where the system acquires depth input from a structured-light camera, outdoors SoV relies on stereo vision to detect the elements of interest and provide an audio and/or haptic representation of the environment to the user. Our stereo-based method is designed to work with wearable acquisition devices and still provide a reliable, real-time description of the scene despite unreliable depth input from the stereo correspondence and the complex 6-DOF motion of the head-worn camera. We quantitatively evaluate our approach on a custom benchmarking dataset acquired with SoV cameras and present the highlights of the usability evaluation with visually impaired users.
For most visually impaired people, simple tasks such as understanding the environment or moving safely through it are major challenges. The Sound of Vision system was designed as a sensory substitution device, based on computer vision techniques, that encodes any environment in a naturalistic representation through audio and haptic feedback. This paper presents a study of the usability of this system for visually impaired people in relevant environments. The aim of the study is to assess how well the system supports the perception and mobility of visually impaired participants in real-life environments and circumstances. The testing scenarios were devised to allow the assessment of the added value of the Sound of Vision system compared to traditional assistive instruments, such as the white cane. Several kinds of data were collected during the tests to allow a better evaluation of the performance: system configuration, completion times, electro-dermal activity, video footage, and user feedback. With minimal training, the system could be used successfully in outdoor environments to perform various perception and mobility tasks. Both the participants and the evaluation results confirmed the benefits of the Sound of Vision device over the white cane: early feedback about static and dynamic objects, and feedback about elevated objects, walls, negative obstacles (e.g., holes in the ground), and signs.
Environment perception and understanding are critical aspects of most computer vision systems and applications. State-of-the-art techniques for this vision task (e.g., semantic instance segmentation) either require dedicated hardware resources or have long execution times. In general, most efforts have focused on improving the accuracy of these methods rather than on making them faster. This paper presents a novel solution to speed up the semantic instance segmentation task. The solution combines two state-of-the-art methods, one for semantic instance segmentation and one for optical flow estimation. To reduce the inference time, the proposed framework (i) runs the segmentation inference only on every 5th frame, and (ii) for the remaining four frames, uses the motion map computed by optical flow to warp the instance segmentation output. With this strategy, the execution time is substantially reduced while the accuracy is preserved at state-of-the-art levels. We evaluate our solution on two datasets using available benchmarks. We then discuss the results obtained, highlighting the accuracy of the solution and its real-time operation capability.
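The keyframe strategy described in this abstract can be illustrated with a minimal sketch, under several assumptions that do not come from the paper. The names `segment_fn`, `flow_fn`, `warp_labels`, and `segment_video` are hypothetical. The flow is assumed to be a dense backward field that maps each pixel of the current frame to its source in the previous frame. Labels are assumed to be propagated from the previous output, not directly from the last keyframe. Only the interval of 5 frames is taken from the abstract.

```python
import numpy as np

KEYFRAME_INTERVAL = 5  # from the abstract: full inference on every 5th frame


def warp_labels(prev_labels, backward_flow):
    """Warp an integer label map with nearest-neighbour backward warping.

    Assumed convention: backward_flow[y, x] = (dx, dy) is the displacement
    from pixel (x, y) in the current frame to its source in the previous one.
    """
    h, w = prev_labels.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Round to the nearest source pixel and clamp to the image borders.
    src_x = np.clip(np.rint(xs + backward_flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + backward_flow[..., 1]).astype(int), 0, h - 1)
    return prev_labels[src_y, src_x]


def segment_video(frames, segment_fn, flow_fn, interval=KEYFRAME_INTERVAL):
    """Run segment_fn on keyframes and propagate labels in between.

    segment_fn(frame) -> (H, W) integer label map (hypothetical model wrapper)
    flow_fn(current, previous) -> (H, W, 2) backward flow (hypothetical)
    """
    outputs = []
    labels = None
    previous = None
    for index, frame in enumerate(frames):
        if index % interval == 0:
            # Keyframe: pay the full cost of the segmentation network.
            labels = segment_fn(frame)
        else:
            # Intermediate frame: warp the previous labels with optical flow.
            labels = warp_labels(labels, flow_fn(frame, previous))
        outputs.append(labels)
        previous = frame
    return outputs
```

Nearest-neighbour sampling is used here because instance labels are categorical identifiers, so interpolating between them would produce meaningless values. With an interval of 5, the segmentation network runs on one frame in five, and the flow estimator runs on the other four.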