2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019
DOI: 10.1109/cvpr.2019.00342

Understanding the Limitations of CNN-Based Absolute Camera Pose Regression

Abstract: Visual localization is the task of accurate camera pose estimation in a known scene. It is a key problem in computer vision and robotics, with applications including self-driving cars, Structure-from-Motion, SLAM, and Mixed Reality. Traditionally, the localization problem has been tackled using 3D geometry. Recently, end-to-end approaches based on convolutional neural networks have become popular. These methods learn to directly regress the camera pose from an input image. However, they do not achieve the same …

citations
Cited by 359 publications
(311 citation statements)
references
References 81 publications
“…While previous image-based geolocalization methods have been primarily based on ground-to-ground (streetview level) image matching (e.g., [5,24,19,18,20,27]), using ground-to-aerial cross-view matching (i.e., matching ground-level photos to aerial imagery) is becoming an attractive approach for image-based localization, thanks to the widespread coverage (over the entire Earth) and easy accessibility of satellite and aerial survey images. In contrast, the coverage of ground-level street views (such as Google-map or Bing-map) is at best limited to urban areas.…”
Section: Introduction (mentioning)
confidence: 99%
“…Like local matching methods, they also generalise well from novel poses. However, whilst they tend to be more accurate than local matching methods at small/medium scale, they have not yet been shown to scale well to very large scenes [60].…”
Section: Back-project Points (mentioning)
confidence: 98%
“…While such end-to-end methods have demonstrated some success, they have not achieved similar accuracy as geometry-based solutions (e.g., those that optimise pose from a correspondence set). Moreover, recent work [29] suggests that "absolute pose regression approaches are more closely related to approximate pose estimation via image retrieval", thus they may not generalise well in practice.…”
Section: Monocular Pose Estimation (mentioning)
confidence: 99%