2016
DOI: 10.1002/rob.21655
Robust Monocular Visual Teach and Repeat Aided by Local Ground Planarity and Color‐constant Imagery

Abstract: Visual Teach and Repeat (VT&R) allows an autonomous vehicle to accurately repeat a previously traversed route using only vision sensors. Most VT&R systems rely on natively three‐dimensional (3D) sensors such as stereo cameras for mapping and localization, but many existing mobile robots are equipped with only 2D monocular vision, typically for teleoperation. In this paper, we extend VT&R to the most basic sensor configuration—a single monocular camera. We show that kilometer‐scale route repetition can be achieved…

Cited by 25 publications (19 citation statements)
References 42 publications (71 reference statements)
“…Our goal in this work is to learn a nonlinear transformation f : R³ → R mapping the RGB colorspace onto a grayscale colorspace that explicitly maximizes a chosen performance metric of a vision-based localization pipeline. We investigate two approaches to formulating such a mapping: 1) a single function to be applied as a pre-processing step to all incoming images, similarly to [11], [13], [14]; and 2) a parametrized function tailored to the specific image pair to be used for localization, where the parameters of this function are derived from the images themselves. Additionally, the functional form of either mapping may be specified analytically (e.g., from physics) or learned from data using a function approximator such as a neural network.…”
Section: Learning Matchable Colorspace Transformations
confidence: 99%
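The two formulations described in this passage could be prototyped along the following lines. This is a minimal, hypothetical sketch: the function names, channel weights, and the per-pair heuristic are illustrative assumptions rather than the cited paper's method; it only makes the distinction between a fixed pre-processing transform and a per-image-pair transform concrete.

```python
import numpy as np

def fixed_grayscale_transform(rgb, weights=(0.25, 0.6, 0.15)):
    """Approach 1: one nonlinear map f: R^3 -> R applied to all images.

    The weights are placeholders; in the setting described above they
    would be tuned offline against a localization performance metric.
    """
    rgb = rgb.astype(np.float64) / 255.0
    eps = 1e-6  # avoid log(0)
    return (weights[0] * np.log(rgb[..., 0] + eps)
            + weights[1] * np.log(rgb[..., 1] + eps)
            + weights[2] * np.log(rgb[..., 2] + eps))

def pairwise_grayscale_transform(rgb_a, rgb_b):
    """Approach 2: parameters derived from the specific image pair.

    Example heuristic (an assumption, not the paper's): balance channel
    statistics across the two images so the resulting grayscale images
    are photometrically closer before feature matching.
    """
    means_a = rgb_a.reshape(-1, 3).mean(axis=0) + 1e-6
    means_b = rgb_b.reshape(-1, 3).mean(axis=0) + 1e-6
    weights = 1.0 / np.sqrt(means_a * means_b)
    weights /= weights.sum()
    return (fixed_grayscale_transform(rgb_a, weights),
            fixed_grayscale_transform(rgb_b, weights))
```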
“…However, in practice, these constraints are relaxed and the parameters (α, β) are tuned to a specific environment, sensor, and feature matcher, where the theoretical values do not perform optimally. Indeed, [11], [14] used two sets of parameters tuned to maximize the stability of SURF features [32] in regions where grassy or sandy materials dominate.…”
Section: B. Physically Motivated Transformations
confidence: 99%
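A minimal sketch of the physically motivated color-constant mapping this passage refers to, assuming the commonly used single-channel form I = log(G) − α log(B) − β log(R). The (α, β) defaults below are placeholders: as the quoted text notes, they are tuned empirically for a given environment, sensor, and feature matcher rather than fixed at their theoretical values.

```python
import numpy as np

def color_constant_image(rgb, alpha=0.45, beta=0.55):
    """Map an RGB image to a single illumination-resistant channel.

    alpha/beta are illustrative values; in practice they are tuned to
    maximize the stability of features (e.g., SURF) on the dominant
    terrain, as described in the quoted passage.
    """
    rgb = rgb.astype(np.float64) / 255.0 + 1e-6  # avoid log(0)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return np.log(g) - alpha * np.log(b) - beta * np.log(r)
```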
“…Illumination robustness in visual localization has been previously studied from the perspective of illumination invariance, with methods such as [12]–[14] making use of hand-crafted image transformations to improve feature matching over time. Similarly, affine models [2] and other analytical transformations [15] have been used to improve the robustness of direct visual localization to illumination change.…”
Section: Related Work
confidence: 99%
“…Our direct localization pipeline operates in both mapping (VO) and relocalization modes in a similar vein to topometric visual teach-and-repeat navigation [13], [14], where the camera follows a similar trajectory during both mapping and relocalization phases. As the camera explores the environment in mapping mode, we generate a list of posed keyframes with corresponding image and depth data, creating new keyframes when the translational or rotational distance between the most recent keyframe pose and the current tracking pose exceeds a preset threshold.…”
Section: Keyframe Mapping and Relocalization
confidence: 99%
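The keyframe-creation rule described above (add a new keyframe once the translational or rotational offset from the most recent keyframe exceeds a preset threshold) can be sketched as follows. The 4×4 pose representation and the threshold values are assumptions for illustration, not the cited pipeline's actual parameters.

```python
import numpy as np

def rotation_angle(R_rel):
    """Magnitude of a relative rotation matrix (axis-angle angle)."""
    return np.arccos(np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0))

def needs_new_keyframe(T_keyframe, T_current,
                       trans_thresh=0.25, rot_thresh=np.deg2rad(10.0)):
    """T_* are 4x4 homogeneous camera poses in the world frame.

    Returns True when the current tracking pose has moved far enough
    from the last keyframe, in translation or rotation, to warrant
    creating a new posed keyframe with its image and depth data.
    """
    T_rel = np.linalg.inv(T_keyframe) @ T_current
    trans_dist = np.linalg.norm(T_rel[:3, 3])
    rot_dist = rotation_angle(T_rel[:3, :3])
    return trans_dist > trans_thresh or rot_dist > rot_thresh
```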