2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2017.695
CNN-SLAM: Real-Time Dense Monocular SLAM with Learned Depth Prediction

Abstract: Given the recent advances in depth prediction from Convolutional Neural Networks (CNNs), this paper investigates how predicted depth maps from a deep neural network can be deployed for accurate and dense monocular reconstruction. We propose a method where CNN-predicted dense depth maps are naturally fused together with depth measurements obtained from direct monocular SLAM. Our fusion scheme privileges depth prediction in image locations where monocular SLAM approaches tend to fail, e.g. along low-textured reg…

Cited by 656 publications (448 citation statements) · References 28 publications
“…The photometric factor is treated as a baseline system and the other two types are included both individually and together. We perform this evaluation on selected shortened scenes from the validation set of the ScanNet dataset and use three error metrics: RMSE of the Absolute Trajectory Error (ATE-RMSE), the absolute relative depth difference (absrel) [13] and the average percentage of pixels in which the estimated depth falls within 10% of the true value (pc110) [18]. The results are presented in Table I.…”
Section: B. Ablation Studies (mentioning, confidence: 99%)
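The three error metrics named in the excerpt above (ATE-RMSE, absrel, pc110) can be sketched in Python as follows. This is a minimal illustration based only on the definitions quoted in the excerpt; the function names are my own, and it assumes trajectories are already associated and aligned and that depth maps share the same resolution.

```python
import numpy as np

def ate_rmse(est_traj, gt_traj):
    """RMSE of the Absolute Trajectory Error (ATE-RMSE).

    est_traj, gt_traj: (N, 3) arrays of estimated / ground-truth camera
    positions, assumed already time-associated and rigidly aligned.
    """
    err = np.linalg.norm(est_traj - gt_traj, axis=1)  # per-pose position error
    return float(np.sqrt(np.mean(err ** 2)))

def absrel(est_depth, gt_depth):
    """Mean absolute relative depth difference over valid (gt > 0) pixels."""
    valid = gt_depth > 0
    return float(np.mean(np.abs(est_depth[valid] - gt_depth[valid]) / gt_depth[valid]))

def pc110(est_depth, gt_depth):
    """Percentage of pixels whose estimated depth is within 10% of ground truth."""
    valid = gt_depth > 0
    rel_err = np.abs(est_depth[valid] - gt_depth[valid]) / gt_depth[valid]
    return float(100.0 * np.mean(rel_err < 0.10))
```

In practice, ATE is usually computed after a least-squares rigid (or Sim(3)) alignment of the two trajectories; that alignment step is omitted here for brevity.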
“…Ours, CNN-SLAM [18], LSD-BS [19], Laina [37], on icl/office0 [24]. CNN-SLAM was run without pose graph optimisation and our system without loop closures.…”
Section: Sequence (mentioning, confidence: 99%)
“…Combined with SLAM systems, 2D semantic segmentation can be achieved in 3D environments [RA17] [TTLN17] [ZSS17] [MHDL17], a promising direction for robotic vision understanding and autonomous driving. Unlike these existing methods, which aim to provide a semantic understanding of the scene for robots, we focus our attention on human interactions.…”
Section: Previous Work (mentioning, confidence: 99%)
“…Simultaneous Localization and Mapping (SLAM) is the process of building the map of an unknown environment and concurrently determining the location of a robot using this map. Recently introduced SLAM algorithms have object-oriented design [63], [64], [65], [66], [67], [68], [69], [70], [71]. Accurate object pose parameters provide camera-object constraints and lead to better performance on localization and mapping.…”
Section: Introduction (mentioning, confidence: 99%)