On-the-Fly Adaptation of Regression Forests for Online Camera Relocalisation

Cavallari, Tommaso; Golodetz, Stuart; Lord, Nicholas A.; Valentin, Julien; Stefano, Luigi Di; Torr, Philip H. S.

doi:10.1109/cvpr.2017.31

Cited by 119 publications

(150 citation statements)

References 42 publications

(103 reference statements)

Citation types: Supporting, Mentioning (150), Contrasting


“…Machine learning-based approaches either replace the 2D-3D matching stage through scene coordinate regression [10,12,16,[52][53][54]79], i.e., they regress the 3D point coordinate in each 2D-3D match, or directly regress the camera pose from an image [8,13,35,36,89]. The former type of methods achieves state-of-the-art localization accuracy in small-scale scenes [12,16,53], but do not seem to easily scale to larger scenes [12]. The latter type of methods have recently been shown to not perform consistently better than image retrieval methods [76], i.e., approaches that approximate the pose of the query image by the pose of the most similar database image [3,38,87].…”

Section: Related Work (mentioning)

confidence: 99%

“…The scene representation used by localization algorithms is typically recovered from images depicting a given scene. The type of representation can vary from a set of images with associated camera poses [8,75,98], over 3D models constructed from Structure-from-Motion [77,81], to weights encoded in convolutional neural networks (CNNs) [8,10,12,13,35,36,52] or random forests [11,16,79]. In practice, capturing a scene from all possible view-…”

Section: Introduction (mentioning)

confidence: 99%


Fine-Grained Segmentation Networks: Self-Supervised Segmentation for Improved Long-Term Visual Localization

Larsson, Stenborg, Toft, et al. 2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)


Long-term visual localization is the problem of estimating the camera pose of a given query image in a scene whose appearance changes over time. It is an important problem in practice, for example, encountered in autonomous driving. In order to gain robustness to such changes, long-term localization approaches often use semantic segmentations as an invariant scene representation, as the semantic meaning of each scene part should not be affected by seasonal and other changes. However, these representations are typically not very discriminative due to the limited number of available classes. In this paper, we propose a new neural network, the Fine-Grained Segmentation Network (FGSN), that can be used to provide image segmentations with a larger number of labels and can be trained in a self-supervised fashion. In addition, we show how FGSNs can be trained to output consistent labels across seasonal changes. We demonstrate through extensive experiments that integrating the fine-grained segmentations produced by our FGSNs into existing localization algorithms leads to substantial improvements in localization performance.


“…One online local regression approach is that of [13,12], which showed how to adapt the regression forests of [63] for online use in real time. Their approach achieves state-of-the-art performance on the popular 7-Scenes [63] and Stanford 4 Scenes [68] indoor datasets, and also performs well on some of the easier outdoor scenes from Cambridge Landmarks [36,34,35].…”

Section: Back-project Points (mentioning)

confidence: 99%

“…Indeed, the broader trend in machine learning has been towards replacing models such as regression forests with neural networks that can learn suitable features, rather than trying to hand-craft them manually. However, replacing the forests used by [13,12] with networks is not straightforward. To achieve online relocalisation, they rely on the way in which their forests predict leaves containing reservoirs of points to adapt forests between scenes, and it is tricky to see how this scheme can be easily transferred to work with local regression networks, which tend to directly predict individual points in the training scene.…”

Section: Back-project Points (mentioning)

confidence: 99%

Let's Take This Online: Adapting Scene Coordinate Regression Network Predictions for Online RGB-D Camera Relocalisation

Cavallari, Bertinetto, Mukhoti, et al. 2019

2019 International Conference on 3D Vision (3DV)

Self Cite


Many applications require a camera to be relocalised online, without expensive offline training on the target scene. Whilst both keyframe and sparse keypoint matching methods can be used online, the former often fail away from the training trajectory, and the latter can struggle in textureless regions. By contrast, scene coordinate regression (SCoRe) methods generalise to novel poses and can leverage dense correspondences to improve robustness, and recent work has shown how to adapt SCoRe forests between scenes, allowing their state-of-the-art performance to be leveraged online. However, because they use features hand-crafted for indoor use, they do not generalise well to harder outdoor scenes. Whilst replacing the forest with a neural network and learning suitable features for outdoor use is possible, the techniques used to adapt forests between scenes are unfortunately harder to transfer to a network context. In this paper, we address this by proposing a novel way of leveraging a network trained on one scene to predict points in another scene. Our approach replaces the appearance clustering performed by the branching structure of a regression forest with a two-step process that first uses the network to predict points in the original scene, and then uses these predicted points to look up clusters of points from the new scene. We show experimentally that our online approach achieves state-of-the-art performance on both the 7-Scenes and Cambridge Landmarks datasets, whilst running in under 300ms, making it highly effective in live scenarios.
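The two-step process described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the regular voxel grid, the `CELL_SIZE` value, and the names `cell_of`, `adapt`, `lookup`, and `predict_original` (a stand-in for the network trained on the original scene) are all assumptions made for illustration.

```python
import math
from collections import defaultdict

CELL_SIZE = 0.5  # hypothetical grid resolution in metres (assumption)


def cell_of(point, cell_size=CELL_SIZE):
    """Quantise a 3D point in the original scene into an integer grid cell."""
    return tuple(math.floor(c / cell_size) for c in point)


def adapt(predict_original, new_scene_samples):
    """Step 1 at adaptation time: group new-scene points by predicted cell.

    predict_original: hypothetical callable mapping an image feature to a
        predicted 3D point in the ORIGINAL scene (stand-in for the network).
    new_scene_samples: iterable of (feature, point_in_new_scene) pairs.
    """
    clusters = defaultdict(list)
    for feature, new_point in new_scene_samples:
        clusters[cell_of(predict_original(feature))].append(new_point)
    return clusters


def lookup(predict_original, clusters, feature):
    """Step 2 at test time: predict an original-scene point, then return
    the cluster of new-scene points stored for its grid cell."""
    return clusters.get(cell_of(predict_original(feature)), [])
```

In this sketch the grid cells play the role that forest leaves play in the forest-based approach: features that produce similar predictions in the original scene end up sharing a cluster of candidate points from the new scene.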


“…State-of-the-art approaches for accurate visual localization are based on matches between 2D image and 3D scene coordinates [10,11,16,40,46,47,56,71,72,84]. These 2D-3D matches are either established using explicit feature matching [40,56,71,72,84] or via learning-based scene co-…”

Section: Introduction (mentioning)

confidence: 99%

Is This the Right Place? Geometric-Semantic Pose Verification for Indoor Visual Localization

Taira, Rocco, Sedlář, et al. 2019

2019 IEEE/CVF International Conference on Computer Vision (ICCV)


Visual localization in large and complex indoor scenes, dominated by weakly textured rooms and repeating geometric patterns, is a challenging problem with high practical relevance for applications such as Augmented Reality and robotics. To handle the ambiguities arising in this scenario, a common strategy is, first, to generate multiple estimates for the camera pose from which a given query image was taken. The pose with the largest geometric consistency with the query image, e.g., in the form of an inlier count, is then selected in a second stage. While a significant amount of research has concentrated on the first stage, there is considerably less work on the second stage. In this paper, we thus focus on pose verification. We show that combining different modalities, namely appearance, geometry, and semantics, considerably boosts pose verification and consequently pose accuracy. We develop multiple hand-crafted as well as a trainable approach to join into the geometric-semantic verification and show significant improvements over state-of-the-art on a very challenging indoor dataset.
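The baseline second stage mentioned in this abstract, selecting the candidate pose with the largest inlier count, can be sketched as follows. This shows only the plain geometric baseline, not the paper's geometric-semantic verification. The pinhole model with the principal point at the image origin, the default `focal` and `threshold_px` values, and all function names are assumptions for illustration.

```python
import math


def project(pose, point, focal):
    """Project a 3D world point with a simple pinhole camera.

    pose = (R, t) maps world to camera coordinates; R is a 3x3 nested
    list and t a length-3 sequence. Returns None if the point lies
    behind the camera.
    """
    R, t = pose
    x, y, z = (sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3))
    if z <= 0:
        return None
    return (focal * x / z, focal * y / z)


def inlier_count(pose, matches, focal, threshold_px):
    """Count 2D-3D matches whose reprojection error is within the threshold."""
    count = 0
    for pixel, point in matches:
        projected = project(pose, point, focal)
        if projected is not None and math.dist(projected, pixel) <= threshold_px:
            count += 1
    return count


def select_pose(candidates, matches, focal=500.0, threshold_px=2.0):
    """Return the candidate pose with the largest inlier count."""
    return max(candidates, key=lambda pose: inlier_count(pose, matches, focal, threshold_px))
```

When several candidates tie, `max` returns the first one, so the order of `candidates` acts as an implicit tie-breaker in this sketch.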
