2022
DOI: 10.48550/arxiv.2209.09050
Preprint

Loc-NeRF: Monte Carlo Localization using Neural Radiance Fields

Abstract: We present Loc-NeRF, a real-time vision-based robot localization approach that combines Monte Carlo localization and Neural Radiance Fields (NeRF). Our system uses a pre-trained NeRF model as the map of an environment and can localize itself in real-time using an RGB camera as the only exteroceptive sensor onboard the robot. While neural radiance fields have seen significant applications for visual rendering in computer vision and graphics, they have found limited use in robotics. Existing approaches for NeRF-…
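As a rough illustration of the approach described in the abstract, the sketch below shows how a particle filter (Monte Carlo localization) could use a pre-trained NeRF as its measurement model: particles are propagated by odometry, weighted by the photometric agreement between the live RGB image and a NeRF rendering at each particle's pose, and then resampled. The `render_fn` interface and the noise scales are assumptions for illustration, not Loc-NeRF's actual implementation.

```python
import numpy as np

def mcl_update(particles, odom_delta, observed_image, render_fn, motion_noise=0.05):
    """One predict/update/resample step of Monte Carlo localization.

    particles : (N, 6) array of camera poses (x, y, z, roll, pitch, yaw)
    render_fn : callable mapping a pose to an RGB image rendered by a
                pre-trained NeRF (hypothetical interface)
    """
    # Predict: propagate each particle by the odometry increment plus noise.
    particles = particles + odom_delta + np.random.normal(
        scale=motion_noise, size=particles.shape)

    # Update: weight each particle by photometric agreement between the
    # camera image and the NeRF rendering at the particle's pose.
    weights = np.empty(len(particles))
    for i, pose in enumerate(particles):
        rendered = render_fn(pose)
        error = np.mean((rendered - observed_image) ** 2)
        weights[i] = np.exp(-error / 0.1)  # illustrative noise scale
    weights /= weights.sum()

    # Pose estimate: weighted mean of the particle set.
    estimate = np.average(particles, axis=0, weights=weights)

    # Resample: draw particles proportionally to their weights.
    idx = np.random.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], estimate
```

In practice, rendering only a small subset of pixels per particle is a common way to keep such an update affordable in real time.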

Cited by 3 publications (5 citation statements)
References 39 publications
“…However, the training process remains lengthy, as DFNet still needs to train the NeRF, pose regression, and feature extraction networks separately. Maggio et al. proposed a Monte Carlo localization method called Loc-NeRF (Maggio et al. 2022), in which Loc-NeRF continuously samples candidate poses around the initial pose and uses NeRF to render novel views to find the correct pose direction. However, Loc-NeRF is unstable and still requires an initial camera pose.…”
Section: Related Work (mentioning)
confidence: 99%
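The candidate-pose sampling this statement describes can be pictured as perturbing an initial pose guess in translation and rotation, then scoring each candidate with a NeRF rendering. The sketch below is only an illustration of that sampling step under assumed Gaussian noise scales; it is not Loc-NeRF's code.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def sample_candidate_poses(init_position, init_rotation, n=64,
                           trans_sigma=0.1, rot_sigma_deg=5.0):
    """Draw candidate camera poses around an initial guess.

    init_position : (3,) translation of the initial pose
    init_rotation : scipy Rotation giving the initial orientation
    Returns lists of perturbed positions and rotations.
    """
    positions, rotations = [], []
    for _ in range(n):
        # Gaussian perturbation of the translation.
        positions.append(init_position + np.random.normal(scale=trans_sigma, size=3))
        # Small random rotation composed with the initial orientation.
        axis_angle = np.random.normal(scale=np.deg2rad(rot_sigma_deg), size=3)
        rotations.append(R.from_rotvec(axis_angle) * init_rotation)
    return positions, rotations
```

Each candidate pose would then be rendered through the NeRF and compared against the camera image to decide which direction to move the estimate.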
“…Moreover, from the adaptation module, we can also learn a score for each dense feature for better feature matching and initial localization. Second, rendering-based optimization may easily become stuck in a local minimum (Maggio et al. 2022) and, because it backpropagates through the networks, is also time-consuming. To improve neural rendering-based optimization with a point-based representation, we further propose a novel, efficient rendering-based optimization framework that aligns the rendered image with the query image and minimizes a warping loss function.…”
Section: Introduction (mentioning)
confidence: 99%
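The rendering-based optimization mentioned here typically means gradient descent on the camera pose through a differentiable renderer, driven by a photometric (or warping) loss. A minimal sketch follows, assuming a hypothetical differentiable `render_fn`; it also makes the local-minimum concern concrete, since the loss is highly non-convex in the pose.

```python
import torch

def refine_pose(init_pose, query_image, render_fn, steps=100, lr=1e-2):
    """Refine a 6-DoF pose by gradient descent on a photometric loss.

    init_pose : (6,) tensor (translation + axis-angle rotation)
    render_fn : differentiable function mapping a pose tensor to an RGB image
                (hypothetical; stands in for rendering through a NeRF)
    """
    pose = init_pose.clone().requires_grad_(True)
    optimizer = torch.optim.Adam([pose], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        rendered = render_fn(pose)
        # Photometric loss between the rendered and query images.
        loss = torch.mean((rendered - query_image) ** 2)
        loss.backward()   # backpropagates through the rendering network
        optimizer.step()
    return pose.detach()
```

Because every step requires a full differentiable rendering pass, this kind of refinement is slow and converges only from a good initialization, which is exactly the limitation the citing authors point out.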
“…NeRF predicts the RGB color and density of a point in a scene so that an image from an arbitrary viewpoint can be rendered. This property enables pose estimation [1,30,31,44] based on the photometric loss between the observed image and the rendered image, or the manipulation of tricky objects [5,12,14,25,29]. A pretrained NeRF can also work as a virtual simulator, in which a robot can plan its trajectory [1] and which can be used to train an action policy for the real world [6].…”
Section: Related Work (mentioning)
confidence: 99%
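The statement above summarizes the core NeRF mechanism: per-point color and density are composited along each camera ray to form a pixel. The snippet below is a sketch of the standard discrete volume-rendering sum, C(r) = Σ_i T_i (1 − exp(−σ_i δ_i)) c_i, written for a single ray; variable names are illustrative.

```python
import numpy as np

def composite_ray(colors, densities, deltas):
    """Discrete NeRF-style volume rendering along one ray.

    colors    : (S, 3) RGB predicted at S samples along the ray
    densities : (S,)   volume density sigma at each sample
    deltas    : (S,)   distance between consecutive samples
    Returns the rendered RGB color of the pixel.
    """
    alphas = 1.0 - np.exp(-densities * deltas)                       # per-sample opacity
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))   # accumulated transmittance
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)
```

Rendering a full image just repeats this compositing for every pixel's ray, which is what makes photometric-loss pose estimation against a NeRF possible.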
“…There have been a number of studies that leverage neural radiance fields (NeRF) for applications other than novel view synthesis. Robotic applications of NeRF are also now beginning to emerge [1,15,31,44,55]. However, many of these approaches require a neural radiance field pretrained on a specific scene and are not generalizable to other scenes.…”
Section: Introduction (mentioning)
confidence: 99%
“…Scene coordinate regression directly predicts the absolute 3D coordinates of image pixels and uses the scene structure explicitly to improve accuracy. More recently, many efforts [28][38][22][17] have been made to use implicit neural representations to replace explicit 3D models in the localization pipeline. Unlike commonly used discrete 3D models such as point clouds and voxel grids, NeRF is an implicit 3D representation inferred from a sparse set of posed images, which models geometry and visual information in continuous 3D space.…”
Section: Introduction (mentioning)
confidence: 99%
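To make the notion of an "implicit 3D representation" concrete, the sketch below shows a tiny network that maps a 3D position and viewing direction to color and density, i.e. a continuous field that can be queried anywhere in space. It is a simplified stand-in for the NeRF architecture (no positional encoding, far fewer layers), not any cited paper's model.

```python
import torch
import torch.nn as nn

class TinyNeRF(nn.Module):
    """Minimal implicit scene representation in the spirit of NeRF.

    Maps a 3D position and a viewing direction to RGB color and density.
    """
    def __init__(self, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.density_head = nn.Linear(hidden, 1)
        self.color_head = nn.Sequential(
            nn.Linear(hidden + 3, hidden // 2), nn.ReLU(),
            nn.Linear(hidden // 2, 3), nn.Sigmoid(),
        )

    def forward(self, xyz, view_dir):
        feat = self.trunk(xyz)
        sigma = torch.relu(self.density_head(feat))   # non-negative density
        rgb = self.color_head(torch.cat([feat, view_dir], dim=-1))
        return rgb, sigma
```

Because the scene is stored in the network weights rather than in points or voxels, the representation is continuous and can be rendered from arbitrary viewpoints, which is the property the localization pipelines above exploit.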