Fast semi-dense 3D semantic mapping with monocular visual SLAM

Li, Xuanpeng; Ao, Huanxuan; Belaroussi, Rachid; Gruyer, Dominique

doi:10.1109/itsc.2017.8317942

Cited by 52 publications

(54 citation statements)

References 19 publications


“…There has been a great interest from the computer vision and robotics communities to exploit object-level information since from the perspective of many applications, it is beneficial to explore the awareness that object instances can provide for assistive computer vision [7,8,9], tracking/SLAM [10,11], or place categorization/scene recognition and life-long mapping [12,13].…”

Section: Related Work (mentioning)

confidence: 99%

“…The segmented labels are then projected/registered into the 3D reconstructed point cloud. Similarly, Li and Belaroussi [10] provided a 3D semantic mapping system from monocular images. Their methodology is based on LSD-SLAM [32], which estimates a semi-dense 3D reconstruction of the scene and performs camera localization from monocular images.…”

Section: Slam and Augmented Semantic Representations (mentioning)

confidence: 99%


Extending Maps with Semantic and Contextual Object Information for Robot Navigation: a Learning-Based Framework Using Visual and Depth Cues

Martins, Bersan, Campos, et al. 2020

J Intell Robot Syst


This paper addresses the problem of building augmented metric representations of scenes with semantic information from RGB-D images. We propose a complete framework to create an enhanced map representation of the environment with object-level information to be used in several applications such as human-robot interaction, assistive robotics, visual navigation, or manipulation tasks. Our formulation leverages a CNN-based object detector (YOLO) with a 3D model-based segmentation technique to perform instance semantic segmentation, and to localize, identify, and track different classes of objects in the scene. The tracking and positioning of semantic classes is done with a dictionary of Kalman filters in order to combine sensor measurements over time and thereby provide more accurate maps. The formulation is designed to identify and disregard dynamic objects in order to obtain a medium-term invariant map representation. The proposed method was evaluated with collected and publicly available RGB-D data sequences acquired in different indoor scenes. Experimental results show the potential of the technique to produce augmented semantic maps containing several objects (notably doors). We also provide to the community a dataset composed of annotated object classes (doors, fire extinguishers, benches, water fountains) and their positioning, as well as the source code as ROS packages.

Footnote: Preprint version of a paper to appear in the Journal of Intelligent & Robotic Systems, available online at: https://doi.
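The "dictionary of Kalman filters" idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes static objects, one independent scalar filter per coordinate, and hypothetical names (`ScalarKalman`, `ObjectTracker`) and noise values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalman:
    """Constant-position Kalman filter for a single coordinate."""
    x: float          # current state estimate
    p: float          # variance of the estimate
    q: float = 1e-4   # process noise variance (assumed value)
    r: float = 0.05   # measurement noise variance (assumed value)

    def update(self, z: float) -> float:
        # Predict: the object is assumed static, so only the variance grows.
        self.p += self.q
        # Correct: blend the prediction and the measurement by the Kalman gain.
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)
        self.p *= (1.0 - k)
        return self.x


class ObjectTracker:
    """Keeps one set of filters per object, keyed by (class, instance id)."""

    def __init__(self):
        self.filters = {}

    def observe(self, key, position):
        # First sighting: initialise the filters at the measured position.
        if key not in self.filters:
            self.filters[key] = [ScalarKalman(x=c, p=1.0) for c in position]
            return tuple(position)
        # Later sightings: fuse the new measurement into each coordinate.
        return tuple(f.update(z) for f, z in zip(self.filters[key], position))


tracker = ObjectTracker()
first = tracker.observe(("door", 0), (2.0, 0.5, 1.0))
second = tracker.observe(("door", 0), (2.2, 0.5, 1.0))
```

Each new sighting of the same object pulls the stored position toward the measurement, with the step size shrinking as the estimate variance falls. A full system would also need data association to decide which key an observation belongs to.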


“…Cheng et al [31] applied ORB-SLAM to get real-scale 3D visual maps and CRF-RNN algorithm for semantic segmentation. In [32], this challenge was solved by combining the state-of-the-art deep learning algorithms and semi-dense SLAM based on a monocular camera. 2D semantic information are transferred to 3D mapping via correspondence between connective Keyframes with spatial consistency.…”

Section: Related Work (mentioning)

confidence: 99%

Visual-Based Semantic SLAM with Landmarks for Large-Scale Outdoor Environment

Zhao, Mao, Ding, et al. 2019

2019 2nd China Symposium on Cognitive Computing and Hybrid Intelligence (CCHI)


Semantic SLAM is an important field in autonomous driving and intelligent agents, which can enable robots to achieve high-level navigation tasks, obtain simple cognition or reasoning ability, and achieve language-based human-robot interaction. In this paper, we built a system to create a semantic 3D map by combining the 3D point cloud from ORB SLAM [1], [2] with semantic segmentation information from the Convolutional Neural Network model PSPNet-101 [3] for large-scale environments. In addition, a new dataset for KITTI [4] sequences has been built, which contains the GPS information and labels of landmarks from Google Map in related streets of the sequences. Moreover, we find a way to associate real-world landmarks with the point cloud map and build a topological map based on the semantic map.
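Combining a point cloud with per-pixel segmentation, as the abstract above describes, usually means projecting each 3D point into a labelled image and reading off the class at that pixel. The following is a minimal sketch of that general idea, not the paper's pipeline. It assumes a pinhole camera, points already expressed in the camera frame, and hypothetical function names and intrinsics (`fx`, `fy`, `cx`, `cy`).

```python
def project_point(point_cam, fx, fy, cx, cy):
    """Project a 3D point in the camera frame to integer pixel coordinates."""
    x, y, z = point_cam
    if z <= 0:
        return None  # behind the camera, so not visible
    u = fx * x / z + cx
    v = fy * y / z + cy
    return int(round(u)), int(round(v))


def label_points(points_cam, label_image, fx, fy, cx, cy, unknown=-1):
    """Assign each 3D point the class label of the pixel it projects onto."""
    height = len(label_image)
    width = len(label_image[0])
    labels = []
    for point in points_cam:
        pixel = project_point(point, fx, fy, cx, cy)
        if pixel is None or not (0 <= pixel[0] < width and 0 <= pixel[1] < height):
            labels.append(unknown)  # point falls outside the image
        else:
            u, v = pixel
            labels.append(label_image[v][u])  # row index is v, column index is u
    return labels


# Toy 2x4 label image: left half is class 0, right half is class 1.
label_image = [[0, 0, 1, 1],
               [0, 0, 1, 1]]
points = [(0.0, 0.0, 1.0), (-1.0, 0.0, 1.0), (0.0, 0.0, -1.0), (5.0, 0.0, 1.0)]
labels = label_points(points, label_image, fx=2.0, fy=2.0, cx=2.0, cy=1.0)
```

Points that land behind the camera or outside the image keep the `unknown` label. In practice a point is seen from many keyframes, so the per-frame labels would still need to be fused, for example by voting.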


“…Their main contribution was an efficient spatial regularizing Conditional Random Field (CRF), which smoothes semantic labels throughout the point cloud. Li and Belaroussi (2016) extended this approach to monocular video while using the semi-dense map of LSD-SLAM (Engel et al, 2014). Here, the DeepLab-CNN (Chen et al, 2018) was used instead of a random forest for segmentation.…”

Section: Fig (mentioning)

confidence: 99%

Semi-supervised Semantic Mapping Through Label Propagation with Semantic Texture Meshes

2019


Scene understanding is an important capability for robots acting in unstructured environments. While most SLAM approaches provide a geometrical representation of the scene, a semantic map is necessary for more complex interactions with the surroundings. Current methods treat the semantic map as part of the geometry, which limits scalability and accuracy. We propose to represent the semantic map as a geometrical mesh and a semantic texture coupled at independent resolution. The key idea is that in many environments the geometry can be greatly simplified without losing fidelity, while semantic information can be stored at a higher resolution, independent of the mesh. We construct a mesh from depth sensors to represent the scene geometry and fuse information into the semantic texture from segmentations of individual RGB views of the scene. Making the semantics persistent in a global mesh enables us to enforce temporal and spatial consistency of the individual view predictions. For this, we propose an efficient method of establishing consensus between individual segmentations by iteratively retraining semantic segmentation with the information stored within the map and using the retrained segmentation to re-fuse the semantics. We demonstrate the accuracy and scalability of our approach by reconstructing semantic maps of scenes from NYUv2 and a scene spanning large buildings.

Fig. 1 Semantic Reconstruction: We generate a mesh with RGB texture and semantic annotations. The mesh enables us to ensure temporal and spatial consistency between semantic predictions and allows us to perform label propagation for improved semantic segmentation. Color coding of semantic labels corresponds to the NYUv2 dataset (Silberman et al., 2012).
