Vision-based localization on robots and vehicles remains unsolved when extreme appearance change and viewpoint change are present simultaneously. Current state-of-the-art approaches to this challenge either address only one of the two problems, for example FAB-MAP (viewpoint invariance) or SeqSLAM (appearance invariance), or require extensive training within the test environment, an impractical requirement in many application scenarios. In this paper we significantly improve the viewpoint invariance of the SeqSLAM algorithm by using state-of-the-art deep learning techniques to generate synthetic viewpoints. Our approach differs from other deep learning approaches in that it does not rely on the ability of a convolutional neural network (CNN) to learn invariant features, but only to produce sufficiently good depth images from day-time imagery alone. We evaluate the system on a new multi-lane day-night car dataset gathered specifically to test both appearance and viewpoint change at the same time. Results demonstrate that the use of synthetic viewpoints improves the maximum recall achieved at 100% precision by a factor of 2.2 and the maximum recall by a factor of 2.7, enabling correct place recognition across multiple road lanes and significantly reducing the time between correct localizations.
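The synthetic-viewpoint idea above can be illustrated with a minimal depth-based image warp. This is a sketch, not the paper's implementation: it assumes a pinhole camera with focal length `fx`, a per-pixel depth map (in the paper's setting this would come from a CNN depth predictor), and a purely lateral camera shift, under which a pixel at column `u` with depth `Z` moves to `u - fx * shift / Z`. The function name and signature are illustrative.

```python
import numpy as np

def synthesize_lateral_view(image, depth, fx, shift_m):
    """Forward-warp a grayscale image to a camera translated `shift_m`
    metres sideways, using per-pixel depth and a pinhole model.
    Pixels with no source mapping are left black (no inpainting)."""
    h, w = depth.shape
    out = np.zeros_like(image)
    us, vs = np.meshgrid(np.arange(w), np.arange(h))
    # Back-project: X = (u - cx) * Z / fx. Shifting the camera right by
    # shift_m reduces each point's X by shift_m, so the new column is
    # u' = u - fx * shift_m / Z (a depth-dependent disparity).
    u_new = np.round(us - fx * shift_m / np.maximum(depth, 1e-6)).astype(int)
    valid = (u_new >= 0) & (u_new < w) & (depth > 0)
    out[vs[valid], u_new[valid]] = image[vs[valid], us[valid]]
    return out
```

With constant depth the warp reduces to a uniform horizontal shift; with real depth, nearby structure shifts more than distant structure, which is what lets a single-lane reference traverse match imagery recorded from an adjacent lane.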
Vision-based place recognition is becoming an increasingly viable component of navigation systems for autonomous robots and personal aids. However, attaining robustness to variations in environmental conditions—such as time of day, weather and season—and camera viewpoint remains a major challenge. Featureless, sequence-based place recognition techniques have demonstrated promise, but often rely on long image sequences, manually-tuned parameters and exhaustive sequence match searching through multiple locations and image scales. In this paper, we address these deficiencies by implementing a condition-invariant, sequence-based place recognition algorithm suitable for networked environments, such as city streets, and routes with lateral platform shift, such as multiple-lane roads. We achieve this capability by augmenting the traditional 1D image database with a directed graph to describe the branching of contiguous sections of imagery at intersections. A particle filter is then used to efficiently explore these paths, as well as various lateral positions synthesized by rescaling imagery. Our proposed approach eliminates manual tuning of sequence length parameters, improves localization on branched routes, improves overall place recognition accuracy and coverage, and reduces computational requirements. We evaluated the new method against the original SeqSLAM and SMART algorithms on two day–night, road-based datasets and a summer–winter train dataset, where it attained superior precision-recall performance and coverage in all environments. Together, these contributions represent a significant step towards the provision of a robust, near parameter-free condition- and viewpoint-invariant visual place recognition capability for vehicles and robots.
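The second abstract's mechanism, a particle filter exploring a directed graph of contiguous image sections, can be sketched as follows. This is a hypothetical minimal implementation, not the paper's code: the graph structure (`successors`, `section_len`), the uniform branching at intersections, and the `score` callback standing in for image-similarity weighting are all assumptions made for illustration; the lateral-scale hypotheses mentioned in the abstract are omitted for brevity.

```python
import random
from collections import Counter

class PlaceParticleFilter:
    """Particle filter over a directed graph of image sequences.
    Each node is a contiguous section of route imagery; edges encode
    branching at intersections. A particle is a (section, frame) pair."""

    def __init__(self, successors, section_len, n_particles=200, seed=0):
        self.successors = successors    # section id -> list of next sections
        self.section_len = section_len  # section id -> number of frames
        self.rng = random.Random(seed)
        sections = list(section_len)
        # Initialize particles uniformly across sections and frames.
        self.particles = [
            (s, self.rng.randrange(section_len[s]))
            for s in self.rng.choices(sections, k=n_particles)
        ]

    def predict(self):
        """Advance each particle one frame; branch at section ends."""
        moved = []
        for sec, idx in self.particles:
            idx += 1
            if idx >= self.section_len[sec]:
                nxt = self.successors.get(sec, [])
                if nxt:                          # intersection: pick a branch
                    sec, idx = self.rng.choice(nxt), 0
                else:                            # dead end: hold last frame
                    idx = self.section_len[sec] - 1
            moved.append((sec, idx))
        self.particles = moved

    def update(self, score):
        """score(sec, idx) -> image similarity; resample in proportion."""
        weights = [max(score(s, i), 1e-9) for s, i in self.particles]
        self.particles = self.rng.choices(
            self.particles, weights=weights, k=len(self.particles))

    def estimate(self):
        """Mode of the particle set as the current place hypothesis."""
        return Counter(self.particles).most_common(1)[0][0]
```

Because particles only ever step to adjacent frames or follow graph edges, the filter replaces SeqSLAM's exhaustive search over sequence lengths and offsets with a fixed per-frame cost proportional to the particle count.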