Previous work on visual SLAM has shown that indexing on space and scale facilitates the use of feature descriptors for matching in real-time systems and that this can significantly increase robustness. However, the performance gains necessarily diminish as uncertainty about camera position increases. In this paper we address this issue by introducing a further level of indexing based on appearance, using low-order Haar wavelet coefficients. This enables fast lookup of descriptors even when the camera is lost, hence allowing efficient relocalisation. Results of experiments on a range of real-world test cases demonstrate that the method is effective, including single-frame relocalisation rates of up to 90% using relatively low numbers of descriptor comparisons.

1 Introduction

Recent years have seen the emergence of real-time vision systems capable of tracking 3-D camera pose whilst simultaneously mapping the surrounding environment. Of particular note are those based on the probabilistic formulations which underlie the simultaneous localisation and mapping (SLAM) techniques used in robotics [4, 7]. These have demonstrated the benefit of harnessing the uncertainty relationships encoded in such formulations for focusing image processing operations when and where required, hence enabling real-time operation. Add to this their natural online processing structure and their ability to maintain covariance relationships across estimated parameters, and it is clear that these systems have the potential to provide effective mechanisms for real-time location sensing. Nevertheless, achieving robust performance during erratic, non-smooth camera motion or in visually difficult environments remains a challenge for such systems. A key element is the data association, or feature matching, problem.
If uncertainty is low, then image search regions derived from a probabilistic filter will be small, constraining the spatial search for matches and hence reducing both computation and the likelihood of mismatch. This in turn allows the use of weaker matching techniques, e.g. template matching, in order to further reduce computational load. Of course, it also runs the risk of losing track should uncertainty increase: search regions grow and the probability of mismatch increases, resulting in bad data association and filter instability. An effective way of gaining improved robustness is to base matching on more distinctive descriptors, such as those developed in recent years for object recognition [6, 10]. This is the approach adopted by Chekhlov et al. [5], who utilise the spatial gradient descriptors which form the basis of the Scale-Invariant Feature Transform (SIFT) [6]. They