2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017
DOI: 10.1109/cvpr.2017.31

On-the-Fly Adaptation of Regression Forests for Online Camera Relocalisation

Abstract: Camera relocalisation is an important problem in computer vision, with applications in simultaneous localisation and mapping, virtual/augmented reality and navigation. Common techniques either match the current image against keyframes with known poses coming from a tracker, or establish 2D-to-3D correspondences between keypoints in the current image and points in the scene in order to estimate the camera pose. Recently, regression forests have become a popular alternative to establish such correspondences. The…

Citations: cited by 119 publications (150 citation statements)
References: 42 publications (103 reference statements)
“…Machine learning-based approaches either replace the 2D-3D matching stage through scene coordinate regression [10,12,16,52,53,54,79], i.e., they regress the 3D point coordinate in each 2D-3D match, or directly regress the camera pose from an image [8,13,35,36,89]. The former type of methods achieves state-of-the-art localization accuracy in small-scale scenes [12,16,53], but do not seem to easily scale to larger scenes [12]. The latter type of methods have recently been shown to not perform consistently better than image retrieval methods [76], i.e., approaches that approximate the pose of the query image by the pose of the most similar database image [3,38,87].…”
Section: Related Work (mentioning)
confidence: 99%
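
The image-retrieval baseline described in this statement, which approximates the query pose by the pose of the most similar database image, can be sketched roughly as follows. This is only an illustration under stated assumptions: global image descriptors are taken as already computed plain float vectors, similarity is taken to be cosine similarity, and the names `DatabaseImage`, `cosine_similarity` and `retrieve_pose` as well as the pose layout are hypothetical, not taken from any of the cited papers.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class DatabaseImage:
    # Hypothetical container: a global descriptor plus a known camera pose.
    descriptor: Sequence[float]
    pose: Tuple[float, ...]  # assumed layout, e.g. (x, y, z, qw, qx, qy, qz)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_pose(query_descriptor: Sequence[float],
                  database: Sequence[DatabaseImage]) -> Tuple[float, ...]:
    """Return the pose of the database image most similar to the query."""
    if not database:
        raise ValueError("database must contain at least one image")
    best = max(database,
               key=lambda img: cosine_similarity(query_descriptor, img.descriptor))
    return best.pose
```

The returned pose is only as accurate as the nearest database view, which is why such retrieval is usually described as approximating the pose rather than estimating it.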
“…The scene representation used by localization algorithms is typically recovered from images depicting a given scene. The type of representation can vary from a set of images with associated camera poses [8,75,98], over 3D models constructed from Structure-from-Motion [77,81], to weights encoded in convolutional neural networks (CNNs) [8,10,12,13,35,36,52] or random forests [11,16,79]. In practice, capturing a scene from all possible view-…”
Section: Introduction (mentioning)
confidence: 99%
“…One online local regression approach is that of [13,12], which showed how to adapt the regression forests of [63] for online use in real time. Their approach achieves state-of-the-art performance on the popular 7-Scenes [63] and Stanford 4 Scenes [68] indoor datasets, and also performs well on some of the easier outdoor scenes from Cambridge Landmarks [36,34,35].…”
Section: Back-project Points (mentioning)
confidence: 99%
“…Indeed, the broader trend in machine learning has been towards replacing models such as regression forests with neural networks that can learn suitable features, rather than trying to hand-craft them manually. However, replacing the forests used by [13,12] with networks is not straightforward. To achieve online relocalisation, they rely on the way in which their forests predict leaves containing reservoirs of points to adapt forests between scenes, and it is tricky to see how this scheme can be easily transferred to work with local regression networks, which tend to directly predict individual points in the training scene.…”
Section: Back-project Points (mentioning)
confidence: 99%
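
The idea of leaves holding reservoirs of points that are refilled when moving between scenes can be pictured with a small sketch. It assumes a forest whose tree structure stays fixed while leaf contents are cleared and repopulated from the new scene; the names `LeafReservoir`, `adapt_leaves` and `route_to_leaf`, the capacity parameter, and the use of classic reservoir sampling are illustrative assumptions rather than the cited authors' implementation.

```python
import random
from typing import Callable, Iterable, List, Optional, Tuple

Point3D = Tuple[float, float, float]


class LeafReservoir:
    """Hypothetical leaf that keeps a bounded, uniformly sampled set of 3D points."""

    def __init__(self, capacity: int, rng: Optional[random.Random] = None) -> None:
        self.capacity = capacity
        self.points: List[Point3D] = []
        self.seen = 0  # number of points offered to this leaf so far
        self.rng = rng or random.Random(0)

    def clear(self) -> None:
        """Discard the contents learned from a previous scene."""
        self.points.clear()
        self.seen = 0

    def add(self, point: Point3D) -> None:
        """Classic reservoir sampling: each offered point is kept with equal probability."""
        self.seen += 1
        if len(self.points) < self.capacity:
            self.points.append(point)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.points[j] = point


def adapt_leaves(leaves: List[LeafReservoir],
                 samples: Iterable[Tuple[float, Point3D]],
                 route_to_leaf: Callable[[float], int]) -> None:
    """Keep the (assumed fixed) routing function, but refill every leaf from a new scene."""
    for leaf in leaves:
        leaf.clear()
    for feature, point in samples:
        leaves[route_to_leaf(feature)].add(point)
```

Here `route_to_leaf` stands in for a tree's fixed split functions. Because only the leaf contents change, adaptation amounts to streaming new-scene samples through the unchanged routing, which is the property the statement notes is hard to reproduce in networks that directly predict individual points.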
“…State-of-the-art approaches for accurate visual localization are based on matches between 2D image and 3D scene coordinates [10,11,16,40,46,47,56,71,72,84]. These 2D-3D matches are either established using explicit feature matching [40,56,71,72,84] or via learning-based scene co-… [Figure 1: Using further modalities for indoor visual localization.]…”
Section: Introduction (mentioning)
confidence: 99%