2016
DOI: 10.1002/rob.21655
Robust Monocular Visual Teach and Repeat Aided by Local Ground Planarity and Color‐constant Imagery

Abstract: Visual Teach and Repeat (VT&R) allows an autonomous vehicle to accurately repeat a previously traversed route using only vision sensors. Most VT&R systems rely on natively three‐dimensional (3D) sensors such as stereo cameras for mapping and localization, but many existing mobile robots are equipped with only 2D monocular vision, typically for teleoperation. In this paper, we extend VT&R to the most basic sensor configuration—a single monocular camera. We show that kilometer‐scale route repetition can be achieved…

Cited by 25 publications (19 citation statements)
References 42 publications (71 reference statements)
“…Our goal in this work is to learn a nonlinear transformation f : R³ → R mapping the RGB colorspace onto a grayscale colorspace that explicitly maximizes a chosen performance metric of a vision-based localization pipeline. We investigate two approaches to formulating such a mapping: 1) a single function to be applied as a pre-processing step to all incoming images, similarly to [11], [13], [14]; and 2) a parametrized function tailored to the specific image pair to be used for localization, where the parameters of this function are derived from the images themselves. Additionally, the functional form of either mapping may be specified analytically (e.g., from physics) or learned from data using a function approximator such as a neural network.…”
Section: Learning Matchable Colorspace Transformations
confidence: 99%
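The two formulations described in this passage could be prototyped along the following lines. This is a minimal, hypothetical sketch: the function names, channel weights, and the per-pair heuristic are illustrative assumptions rather than the cited paper's method; it only makes the distinction between a fixed pre-processing transform and a per-image-pair transform concrete.

```python
import numpy as np

def fixed_grayscale_transform(rgb, weights=(0.25, 0.6, 0.15)):
    """Approach 1: one nonlinear map f: R^3 -> R applied to all images.

    The weights are placeholders; in the setting described above they
    would be tuned offline against a localization performance metric.
    """
    rgb = rgb.astype(np.float64) / 255.0
    eps = 1e-6  # avoid log(0)
    return (weights[0] * np.log(rgb[..., 0] + eps)
            + weights[1] * np.log(rgb[..., 1] + eps)
            + weights[2] * np.log(rgb[..., 2] + eps))

def pairwise_grayscale_transform(rgb_a, rgb_b):
    """Approach 2: parameters derived from the specific image pair.

    Example heuristic (an assumption, not the paper's): balance channel
    statistics across the two images so the resulting grayscale images
    are photometrically closer before feature matching.
    """
    means_a = rgb_a.reshape(-1, 3).mean(axis=0) + 1e-6
    means_b = rgb_b.reshape(-1, 3).mean(axis=0) + 1e-6
    weights = 1.0 / np.sqrt(means_a * means_b)
    weights /= weights.sum()
    return (fixed_grayscale_transform(rgb_a, weights),
            fixed_grayscale_transform(rgb_b, weights))
```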
“…However, in practice, these constraints are relaxed and the parameters (α, β) are tuned to a specific environment, sensor, and feature matcher, where the theoretical values do not perform optimally. Indeed, [11], [14] used two sets of parameters tuned to maximize the stability of SURF features [32] in regions where grassy or sandy materials dominate.…”
Section: B. Physically Motivated Transformations
confidence: 99%
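A minimal sketch of the physically motivated color-constant mapping this passage refers to, assuming the commonly used single-channel form I = log(G) − α log(B) − β log(R). The (α, β) defaults below are placeholders: as the quoted text notes, they are tuned empirically for a given environment, sensor, and feature matcher rather than fixed at their theoretical values.

```python
import numpy as np

def color_constant_image(rgb, alpha=0.45, beta=0.55):
    """Map an RGB image to a single illumination-resistant channel.

    alpha/beta are illustrative values; in practice they are tuned to
    maximize the stability of features (e.g., SURF) on the dominant
    terrain, as described in the quoted passage.
    """
    rgb = rgb.astype(np.float64) / 255.0 + 1e-6  # avoid log(0)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return np.log(g) - alpha * np.log(b) - beta * np.log(r)
```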
“…Illumination robustness in visual localization has been previously studied from the perspective of illumination invariance, with methods such as [12]–[14] making use of hand-crafted image transformations to improve feature matching over time. Similarly, affine models [2] and other analytical transformations [15] have been used to improve the robustness of direct visual localization to illumination change.…”
Section: Related Work
confidence: 99%
“…Our direct localization pipeline operates in both mapping (VO) and relocalization modes in a similar vein to topometric visual teach-and-repeat navigation [13], [14], where the camera follows a similar trajectory during both mapping and relocalization phases. As the camera explores the environment in mapping mode, we generate a list of posed keyframes with corresponding image and depth data, creating new keyframes when the translational or rotational distance between the most recent keyframe pose and the current tracking pose exceeds a preset threshold.…”
Section: Keyframe Mapping and Relocalization
confidence: 99%
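The keyframe-creation rule described above (add a new keyframe once the translational or rotational offset from the most recent keyframe exceeds a preset threshold) can be sketched as follows. The 4×4 pose representation and the threshold values are assumptions for illustration, not the cited pipeline's actual parameters.

```python
import numpy as np

def rotation_angle(R_rel):
    """Magnitude of a relative rotation matrix (axis-angle angle)."""
    return np.arccos(np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0))

def needs_new_keyframe(T_keyframe, T_current,
                       trans_thresh=0.25, rot_thresh=np.deg2rad(10.0)):
    """T_* are 4x4 homogeneous camera poses in the world frame.

    Returns True when the current tracking pose has moved far enough
    from the last keyframe, in translation or rotation, to warrant
    creating a new posed keyframe with its image and depth data.
    """
    T_rel = np.linalg.inv(T_keyframe) @ T_current
    trans_dist = np.linalg.norm(T_rel[:3, 3])
    rot_dist = rotation_angle(T_rel[:3, :3])
    return trans_dist > trans_thresh or rot_dist > rot_thresh
```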