We propose a novel and fast multiscale feature detection and description approach that exploits the benefits of nonlinear scale spaces. Previous attempts to detect and describe features in nonlinear scale spaces are highly time-consuming due to the computational burden of creating the nonlinear scale space. In this paper we propose to use recent numerical schemes called Fast Explicit Diffusion (FED) embedded in a pyramidal framework to dramatically speed up feature detection in nonlinear scale spaces. In addition, we introduce a Modified-Local Difference Binary (M-LDB) descriptor that is highly efficient, exploits gradient information from the nonlinear scale space, is scale and rotation invariant and has low storage requirements. We present an extensive evaluation that shows the excellent compromise between speed and performance of our approach compared to state-of-the-art methods such as BRISK, ORB, SURF, SIFT and KAZE.
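To make the FED idea concrete, the following is a minimal sketch of how the varying step sizes of one FED cycle can be computed from the standard FED step-size formula. The function names and the default `tau_max = 0.25` (the stability limit of a plain explicit scheme for 2D diffusion on a unit grid) are assumptions chosen for illustration; this is not the authors' implementation.

```python
import math


def fed_cycle_steps(n, tau_max=0.25):
    """Step sizes of one FED cycle with n explicit diffusion steps.

    tau_max is the stability limit of the ordinary explicit scheme
    (assumed here: 0.25 for 2D diffusion with unit grid spacing).
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    return [
        tau_max / (2.0 * math.cos(math.pi * (2 * j + 1) / (4 * n + 2)) ** 2)
        for j in range(n)
    ]


def fed_cycle_time(n, tau_max=0.25):
    """Total diffusion time covered by one cycle: tau_max * (n^2 + n) / 3."""
    return tau_max * (n * n + n) / 3.0


steps = fed_cycle_steps(5)
# Some individual steps exceed tau_max; the cycle as a whole stays stable.
print(max(steps) > 0.25, round(sum(steps), 6))  # True 2.5
```

The point of the sketch is the quadratic growth of the cycle time: five steps cover a diffusion time of 2.5, whereas five ordinary explicit steps at the stability limit would cover only 1.25.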
In this paper, we introduce the concept of dense scene flow for visual SLAM applications. Traditional visual SLAM methods assume static features in the environment and that a dominant part of the scene changes only due to camera egomotion. These assumptions make traditional visual SLAM methods prone to failure in crowded real-world dynamic environments with many independently moving objects, such as the typical environments for the visually impaired. By means of a dense scene flow representation, moving objects can be detected. In this way, the visual SLAM process can be improved considerably by keeping erroneous measurements out of the estimation, yielding more consistent localization and mapping results. We show large-scale visual SLAM results in challenging indoor and outdoor crowded environments with real visually impaired users. In particular, we performed experiments inside the Atocha railway station and in the city center of Alcalá de Henares, both in Madrid, Spain. Our results show that the combination of visual SLAM and dense scene flow yields accurate localization, considerably improving on the results of traditional visual SLAM methods and GPS-based approaches.
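The moving-object test described above can be sketched as follows: compensate each 3D point for the camera egomotion and flag points whose residual motion is large. The fixed Euclidean threshold, the function names and the toy data are assumptions for illustration only; a practical system would more likely weight the residual by the measurement uncertainty.

```python
import math


def residual_motion(p_prev, p_curr, R, t):
    """Residual 3D motion of one point after egomotion compensation.

    (R, t) maps coordinates from the previous camera frame to the current
    one, so a static point satisfies p_curr == R * p_prev + t.
    """
    predicted = [
        sum(R[i][k] * p_prev[k] for k in range(3)) + t[i] for i in range(3)
    ]
    return math.dist(p_curr, predicted)


def flag_moving(points_prev, points_curr, R, t, threshold=0.1):
    """One boolean per point: True if the residual exceeds threshold (metres)."""
    return [
        residual_motion(p, q, R, t) > threshold
        for p, q in zip(points_prev, points_curr)
    ]


identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Camera moved 0.5 m forward, so static points come 0.5 m closer in z.
egomotion_t = [0.0, 0.0, -0.5]
prev_pts = [(1.0, 0.0, 5.0), (-1.0, 0.0, 4.0)]
curr_pts = [(1.0, 0.0, 4.5), (-0.6, 0.0, 3.5)]  # second point also moved sideways
print(flag_moving(prev_pts, curr_pts, identity, egomotion_t))  # [False, True]
```

Points flagged as moving would then be excluded from the SLAM estimation so that they do not contribute erroneous measurements.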
This article presents the design of an obstacle detection system for assisting visually impaired people. A dense disparity map is computed from the images of a stereo camera carried by the user. By using the dense disparity map, potential obstacles can be detected in 3D in indoor and outdoor scenarios. A ground plane estimation algorithm based on RANSAC plus filtering techniques allows the robust detection of the ground in every frame. A polar grid representation is proposed to account for the potential obstacles in the scene. The design is completed with acoustic feedback to assist visually impaired users while approaching obstacles. Beep sounds with different frequencies and repetitions inform the user about the presence of obstacles. Audio bone conducting technology is employed to play these sounds without preventing the visually impaired user from hearing other important sounds from their local environment. A user study with four visually impaired volunteers supports the proposed system.
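The RANSAC-based ground plane estimation mentioned above can be illustrated with a generic plane fit on 3D points. This is a minimal sketch of textbook RANSAC, not the authors' algorithm: the tolerance, iteration count, fixed seed, camera height of 1.5 m and the toy point cloud are all assumptions, and the additional filtering step is omitted.

```python
import random


def plane_from_points(p1, p2, p3):
    """Plane n.x + d = 0 through three points, or None if they are collinear."""
    u = [p2[i] - p1[i] for i in range(3)]
    v = [p3[i] - p1[i] for i in range(3)]
    n = [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]
    norm = sum(c * c for c in n) ** 0.5
    if norm < 1e-9:
        return None
    n = [c / norm for c in n]
    d = -sum(n[i] * p1[i] for i in range(3))
    return n, d


def ransac_plane(points, iterations=200, tol=0.05, seed=0):
    """Return the plane supported by the most points and its inlier list."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    best_model, best_inliers = None, []
    for _ in range(iterations):
        model = plane_from_points(*rng.sample(points, 3))
        if model is None:
            continue  # degenerate (collinear) sample
        n, d = model
        inliers = [
            p for p in points
            if abs(n[0] * p[0] + n[1] * p[1] + n[2] * p[2] + d) < tol
        ]
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers


# Toy cloud in camera coordinates (y pointing down): flat ground 1.5 m below
# the camera plus a few points on an obstacle standing on it.
ground = [(float(x), 1.5, float(z)) for x in range(-2, 3) for z in range(1, 6)]
obstacle = [(0.5, y, 3.0) for y in (0.0, 0.5, 1.0)] + [(0.7, 0.2, 3.0), (0.6, 0.8, 3.1)]
model, inliers = ransac_plane(ground + obstacle)
print(len(inliers))  # 25
```

Points that are not ground inliers are the candidates that an obstacle representation such as the polar grid would then accumulate.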
Life-long visual localization has been one of the most challenging topics in robotics over the last few years. The difficulty of this task lies in the strong appearance changes that a place suffers due to dynamic elements, illumination, weather or seasons. In this paper, we propose a novel method (ABLE-M) to cope with the main problems of carrying out robust visual topological localization over time. The novelty of our approach resides in the description of sequences of monocular images as binary codes, which are extracted from a global LDB descriptor and efficiently matched using FLANN for fast nearest neighbor search. In addition, an illumination invariant technique is applied. The proposed binary description and matching method reduces memory and computational costs, which is necessary for long-term performance. Our proposal is evaluated in different life-long navigation scenarios, where ABLE-M outperforms some of the main state-of-the-art algorithms, such as WI-SURF, BRIEF-Gist, FAB-MAP or SeqSLAM. Tests are presented for four public datasets where the same route is traversed at different times of day or night, over several months or across all four seasons.
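As a rough illustration of matching image sequences through binary codes, the sketch below concatenates per-image codes into one sequence code and compares codes by Hamming distance. It uses a brute-force search instead of FLANN, and the 8-bit toy codes and all function names are assumptions for illustration; real global descriptors are far longer.

```python
def sequence_code(image_codes, bits_per_image):
    """Concatenate per-image binary codes (non-negative ints) into one code.

    Each code is assumed to fit in bits_per_image bits.
    """
    code = 0
    for c in image_codes:
        code = (code << bits_per_image) | c
    return code


def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return bin(a ^ b).count("1")


def best_match(query, database):
    """Brute-force nearest neighbour: (index, distance) of the closest code."""
    distances = [hamming(query, d) for d in database]
    idx = min(range(len(distances)), key=distances.__getitem__)
    return idx, distances[idx]


# Two stored places, each described by a sequence of two toy 8-bit image codes.
database = [
    sequence_code([0b10101010, 0b11110000], 8),
    sequence_code([0b00001111, 0b00110011], 8),
]
query = sequence_code([0b10101011, 0b11110000], 8)
print(best_match(query, database))  # (0, 1)
```

Because the comparison is an XOR followed by a bit count, both storage and matching stay cheap, which is the property the abstract relies on for long-term operation.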
We propose a continuous optimization method for solving dense 3D scene flow problems from stereo imagery. As in recent work, we represent the dynamic 3D scene as a collection of rigidly moving planar segments. The scene flow problem then becomes the joint estimation of pixel-to-segment assignment, 3D position, normal vector and rigid motion parameters for each segment, leading to a complex and expensive discrete-continuous optimization problem. In contrast, we propose a purely continuous formulation which can be solved more efficiently. Using a fine superpixel segmentation that is fixed a priori, we propose a factor graph formulation that decomposes the problem into photometric, geometric, and smoothing constraints. We initialize the solution with a novel, high-quality initialization method, then independently refine the geometry and motion of the scene, and finally perform a global nonlinear refinement using Levenberg-Marquardt. We evaluate our method on the challenging KITTI Scene Flow benchmark, ranking in third position, while being 3 to 30 times faster than the top competitors (x37 [1] and x3.75 [2]).
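To illustrate what a rigidly moving planar segment implies for a single pixel, the sketch below back-projects a pixel onto its segment's plane, applies the segment's rigid motion and re-projects it. The plane convention `normal . X = d`, the pinhole intrinsics tuple and all names and values are assumptions for illustration; the stereo geometry and the factor graph optimization are not modelled.

```python
def warp_pixel(u, v, K, normal, d, R, t):
    """Warp pixel (u, v) on the plane normal.X = d through X' = R X + t.

    K = (fx, fy, cx, cy) are pinhole intrinsics (assumed parameterization).
    Returns the pixel position of the moved 3D point in the same camera.
    """
    fx, fy, cx, cy = K
    ray = ((u - cx) / fx, (v - cy) / fy, 1.0)
    denom = sum(n * r for n, r in zip(normal, ray))
    if abs(denom) < 1e-12:
        raise ValueError("viewing ray is parallel to the plane")
    scale = d / denom  # depth along the ray at which it meets the plane
    X = [scale * r for r in ray]
    Xm = [sum(R[i][k] * X[k] for k in range(3)) + t[i] for i in range(3)]
    return fx * Xm[0] / Xm[2] + cx, fy * Xm[1] / Xm[2] + cy


K = (700.0, 700.0, 320.0, 240.0)
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Fronto-parallel segment 10 m away that shifts 0.5 m to the right.
u2, v2 = warp_pixel(320.0, 240.0, K, (0.0, 0.0, 1.0), 10.0, identity, (0.5, 0.0, 0.0))
print(round(u2, 6), round(v2, 6))  # 355.0 240.0
```

A photometric constraint would compare image intensities at the original and warped pixel positions, so every pixel of a segment is explained by the same plane and motion parameters.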