In this work we propose a new CNN+LSTM architecture for camera pose regression in indoor and outdoor scenes. The CNN learns feature representations for localization that are robust to motion blur and illumination changes. We apply LSTM units to the CNN output, where they perform a structured dimensionality reduction on the feature vector, leading to drastic improvements in localization performance. We provide an extensive quantitative comparison of CNN-based and SIFT-based localization methods, highlighting the strengths and weaknesses of each. Furthermore, we present a new large-scale indoor dataset with accurate ground truth from a laser scanner. Experimental results on public indoor and outdoor datasets show that our method outperforms existing deep architectures and can localize images under hard conditions, e.g., in the presence of mostly textureless surfaces, where classic SIFT-based methods fail.
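The pipeline described above (CNN features, LSTM as structured dimensionality reduction, regression to a 6-DOF pose) can be sketched as follows. This is an illustrative toy in plain NumPy with made-up dimensions (a 2048-D feature reshaped into a 64-step sequence, a 16-D hidden state, a 7-D output of translation plus unit quaternion) and randomly initialized, untrained weights; it shows the data flow, not the paper's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT = 2048        # assumed CNN output dimension
SEQ, STEP = 64, 32 # reshape 2048 -> 64 time steps of 32-D inputs
HIDDEN = 16        # LSTM hidden size: the reduced representation

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Randomly initialized LSTM and regressor weights (untrained; shapes only).
Wx = rng.standard_normal((4 * HIDDEN, STEP)) * 0.1
Wh = rng.standard_normal((4 * HIDDEN, HIDDEN)) * 0.1
b = np.zeros(4 * HIDDEN)
W_pose = rng.standard_normal((7, HIDDEN)) * 0.1

def lstm_reduce(feat):
    """Run an LSTM over the reshaped CNN feature; return the final hidden state."""
    seq = feat.reshape(SEQ, STEP)
    h = np.zeros(HIDDEN)
    c = np.zeros(HIDDEN)
    for x in seq:
        z = Wx @ x + Wh @ h + b
        i, f, o, g = np.split(z, 4)
        i, f, o, g = sigmoid(i), sigmoid(f), sigmoid(o), np.tanh(g)
        c = f * c + i * g       # cell state update
        h = o * np.tanh(c)      # hidden state: the reduced feature
    return h

def regress_pose(feat):
    """Map the reduced representation to a 3-D translation and a unit quaternion."""
    pose = W_pose @ lstm_reduce(feat)
    t, q = pose[:3], pose[3:]
    q = q / np.linalg.norm(q)   # normalize the orientation quaternion
    return t, q

t, q = regress_pose(rng.standard_normal(FEAT))
print(t.shape, q.shape)  # (3,) (4,)
```

In a real system the regressor would be trained end-to-end with a pose loss; here the point is only that the LSTM compresses a long feature vector into a compact state before regression.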
We propose a graph-based, low-complexity sensor-fusion approach for ubiquitous pedestrian indoor positioning using mobile devices. We employ our fusion technique to combine relative motion information based on step detection with WiFi signal-strength measurements. The method is based on the well-known particle filter methodology. In contrast to previous work, we provide a probabilistic model for location estimation that is formulated directly on a fully discretized, graph-based representation of the indoor environment. We generate this graph by adaptive quantization of the indoor space, removing irrelevant degrees of freedom from the estimation problem. We evaluate the proposed method in two realistic indoor environments using real data collected from smartphones. In total, our dataset spans about 20 kilometers of walked distance and includes 13 users and four different mobile device types. Our results demonstrate that the filter requires an order of magnitude fewer particles than state-of-the-art approaches while maintaining an accuracy of a few meters. The proposed low-complexity solution not only enables indoor positioning on less powerful mobile devices, but also saves much-needed resources for location-based end-user applications which run on top of a localization service.
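The core idea, a particle filter whose particles live on the nodes of a discretized indoor graph, can be sketched in a few lines. Everything below is a hypothetical minimal model (a toy corridor graph, a single access point's RSS fingerprint, a Gaussian measurement likelihood), not the paper's actual formulation; it shows why restricting particles to a graph keeps the particle count small.

```python
import math
import random

# Toy discretized indoor space: nodes are walkable positions, edges neighbors.
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
node_pos = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0), 3: (3.0, 0.0)}

# Hypothetical WiFi fingerprint: expected RSS (dBm) per node, one access point.
rss_map = {0: -40.0, 1: -50.0, 2: -60.0, 3: -70.0}
RSS_SIGMA = 4.0  # assumed measurement noise (dBm)

def predict(particles):
    """On a detected step, move each particle to a random adjacent node."""
    return [random.choice(graph[p]) for p in particles]

def update(particles, rss_measured):
    """Weight particles by a Gaussian RSS likelihood, then resample."""
    weights = [math.exp(-0.5 * ((rss_map[p] - rss_measured) / RSS_SIGMA) ** 2)
               for p in particles]
    total = sum(weights)
    if total == 0:
        return particles  # degenerate case: keep the prior
    return random.choices(particles, weights=weights, k=len(particles))

def estimate(particles):
    """Position estimate: mean of the particles' node coordinates."""
    n = len(particles)
    return (sum(node_pos[p][0] for p in particles) / n,
            sum(node_pos[p][1] for p in particles) / n)

# Simulate a user walking from node 0 toward node 3, measuring RSS each step.
random.seed(0)
particles = [random.choice(list(graph)) for _ in range(200)]
for true_node in [1, 2, 3]:
    particles = predict(particles)
    particles = update(particles, rss_map[true_node])
print(estimate(particles))
```

Because the motion model can only move a particle along graph edges, no particles are wasted on unreachable space, which is the intuition behind needing far fewer particles than a free-space filter.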
Recent advances in the field of content-based image retrieval (CBIR) have made it possible to quickly search large image databases using photographs or video sequences as a query. With appropriately tagged images of places, this technique can be applied to the problem of visual location recognition. While this task has attracted considerable interest in the community, most existing approaches focus on outdoor environments only, mainly because the generation of an indoor dataset is an elaborate and complex undertaking. In order to allow researchers to advance their approaches towards the challenging field of CBIR-based indoor localization and to facilitate an objective comparison of different algorithms, we provide an extensive, high-resolution indoor dataset. The freely available dataset includes realistic query sequences with ground truth as well as point-cloud data, enabling a localization system to perform 6-DOF pose estimation.
State-of-the-art visual odometry algorithms achieve remarkable efficiency and accuracy. Under realistic conditions, however, tracking failures are inevitable, and a recovery strategy is required to continue tracking. In this paper, we propose a relocalization system that enables real-time, 6D pose recovery for wide baselines. Our approach specifically targets resource-constrained hardware such as mobile phones. By exploiting the properties of low-complexity binary feature descriptors, nearest-neighbor search is performed efficiently using Locality Sensitive Hashing. Our method does not require time-consuming offline training of hash tables, and it can be applied to any visual odometry system. We provide a thorough evaluation of effectiveness, robustness, and runtime on an indoor test sequence with available ground-truth poses. We investigate the system parameterization and compare the relocalization performance for the three binary descriptors BRIEF, unscaled BRIEF, and ORB. In contrast to previous work on mobile visual odometry, we are able to quickly recover from tracking failures within maps with thousands of 3D feature points.
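Locality Sensitive Hashing for binary descriptors is commonly implemented by bit sampling: each hash table projects the descriptor onto a small fixed subset of its bits, so similar descriptors collide in at least one table with high probability, and no offline training is needed. The sketch below uses hypothetical parameters (256-bit descriptors modeled as Python integers, 4 tables, 16 sampled bits each) to illustrate the scheme; it is not the paper's implementation.

```python
import random

DESC_BITS = 256     # BRIEF/ORB descriptors modeled as 256-bit integers
NUM_TABLES = 4      # independent hash tables
BITS_PER_HASH = 16  # bit positions sampled per table

random.seed(42)
# Each table uses a fixed random subset of descriptor bit positions.
bit_samples = [random.sample(range(DESC_BITS), BITS_PER_HASH)
               for _ in range(NUM_TABLES)]

def lsh_key(desc, bits):
    """Project a descriptor (int) onto the table's sampled bit positions."""
    return tuple((desc >> pos) & 1 for pos in bits)

def build_tables(descriptors):
    """Insert map descriptors into all hash tables -- no offline training."""
    tables = [dict() for _ in range(NUM_TABLES)]
    for idx, d in enumerate(descriptors):
        for t, bits in enumerate(bit_samples):
            tables[t].setdefault(lsh_key(d, bits), []).append(idx)
    return tables

def query(tables, descriptors, q):
    """Collect candidates from all buckets, rank by exact Hamming distance."""
    candidates = set()
    for t, bits in enumerate(bit_samples):
        candidates.update(tables[t].get(lsh_key(q, bits), []))
    if not candidates:
        return None
    return min(candidates, key=lambda i: bin(descriptors[i] ^ q).count("1"))

# Toy map: random 256-bit descriptors standing in for 3D feature points.
db = [random.getrandbits(DESC_BITS) for _ in range(1000)]
tables = build_tables(db)
# Query: a noisy copy of descriptor 123 with two bits flipped.
noisy = db[123] ^ (1 << 5) ^ (1 << 200)
print(query(tables, db, noisy))
```

Only the (typically small) candidate set is checked with exact Hamming distance, which is why the lookup stays fast even with thousands of map descriptors.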