Self-Supervised Model Adaptation for Multimodal Semantic Segmentation

Valada, Abhinav; Mohan, Renjith; Burgard, Wolfram

doi:10.1007/s11263-019-01188-y

Cited by 200 publications

(157 citation statements)

References 62 publications

Supporting

Mentioning

139

Contrasting


“…In fact, different fusion techniques can be considered: early fusion, late fusion or multi-level fusion. Valada et al. [19] adopt the latter technique by extracting and combining feature maps at different stages in the encoder from multiple input streams. In general, most works, such as [7,2], show that late fusion can achieve better performance.…”

Section: Fusion (mentioning)

confidence: 99%

Sparse and Noisy LiDAR Completion with RGB Guidance and Uncertainty

Gansbeke, Neven, Brabandere et al. 2019

2019 16th International Conference on Machine Vision Applications (MVA)


This work proposes a new method to accurately complete sparse LiDAR maps guided by RGB images. For autonomous vehicles and robotics the use of LiDAR is indispensable in order to achieve precise depth predictions. A multitude of applications depend on the awareness of their surroundings, and use depth cues to reason and react accordingly. On the one hand, monocular depth prediction methods fail to generate absolute and precise depth maps. On the other hand, stereoscopic approaches are still significantly outperformed by LiDAR based approaches. The goal of the depth completion task is to generate dense depth predictions from sparse and irregular point clouds which are mapped to a 2D plane. We propose a new framework which extracts both global and local information in order to produce proper depth maps. We argue that simple depth completion does not require a deep network. However, we additionally propose a fusion method with RGB guidance from a monocular camera in order to leverage object information and to correct mistakes in the sparse input. This improves the accuracy significantly. Moreover, confidence masks are exploited in order to take into account the uncertainty in the depth predictions from each modality. This fusion method outperforms the state-of-the-art and ranks first on the KITTI depth completion benchmark [21]. Our code with visualizations is available at https://github.com/wvangansbeke/Sparse-Depth-Completion.
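The confidence-based fusion mentioned in this abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each branch outputs a depth map plus a per-pixel confidence score and that the scores are normalized with a softmax across branches; the helper name `fuse_depth` and the toy values are hypothetical, and the authors' repository remains the reference for the actual method.

```python
import math

def fuse_depth(depths, confidences):
    """Fuse per-pixel depth predictions from several branches.

    depths, confidences: lists (one entry per branch) of 2D lists with
    identical shapes. Confidences are raw scores; a softmax across
    branches turns them into per-pixel weights that sum to 1.
    Hypothetical helper, not the authors' implementation.
    """
    rows, cols = len(depths[0]), len(depths[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            scores = [conf[r][c] for conf in confidences]
            peak = max(scores)  # subtract the max for numerical stability
            exps = [math.exp(s - peak) for s in scores]
            total = sum(exps)
            fused[r][c] = sum(
                (e / total) * d[r][c] for e, d in zip(exps, depths)
            )
    return fused

# Toy 1x2 example with two branches: equal confidence on the first pixel,
# strong preference for the second branch on the second pixel.
global_depth = [[10.0, 20.0]]
local_depth = [[12.0, 30.0]]
global_conf = [[0.0, 0.0]]
local_conf = [[0.0, 50.0]]
fused = fuse_depth([global_depth, local_depth], [global_conf, local_conf])
```

With equal confidences the fused value is the plain average of the two predictions; when one branch is far more confident, the fused value follows that branch almost exactly.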


“…Method | Cityscapes data | Additional data | mIoU (%) | Runtime (s)
DRN_CRL_Coarse [37] | Fine, Coarse | ImageNet | 82.8 | -
DPC [3] | Fine, Coarse | ImageNet, COCO [17] | 82.7 | -
RelationNet_Coarse [36] | Fine, Coarse | ImageNet | 82.4 | -
SSMA [32] | Fine…”

Section: High Accuracy Network Methods (mentioning)

confidence: 99%

DSNet: An Efficient CNN for Road Scene Segmentation

Chen, Hang, Chan et al. 2019

2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)


Road scene understanding is a critical component in an autonomous driving system. Although deep learning-based road scene segmentation can achieve very high accuracy, its complexity is also very high for developing real-time applications. It is challenging to design a neural net with high accuracy and low computational complexity. To address this issue, we investigate the advantages and disadvantages of several popular CNN architectures in terms of speed, storage and segmentation accuracy. We start from the Fully Convolutional Network (FCN) with VGG, and then we study ResNet and DenseNet. Through detailed experiments, we pick out the favorable components from the existing architectures and, in the end, construct a light-weight network architecture based on DenseNet. Our proposed network, called DSNet, demonstrates real-time testing (inference) ability on a popular GPU platform and maintains an accuracy comparable with most previous systems. We test our system on several datasets including the challenging Cityscapes dataset (resolution of 1024 × 512) with an mIoU of about 69.1% and a runtime of 0.0147 seconds per image on a single GTX 1080Ti. We also design a more accurate model at the price of a slower speed, which has an mIoU of about 72.6% on the CamVid dataset.
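For readers unfamiliar with the metric quoted in this abstract, mean intersection-over-union (mIoU) averages, over classes, the ratio of correctly labeled pixels to the union of predicted and ground-truth pixels for that class. The following is a minimal sketch, assuming flat lists of integer class labels and skipping classes absent from both prediction and ground truth (a common convention, not necessarily the one used by the Cityscapes or CamVid evaluation); `mean_iou` is a hypothetical helper.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that occur in the prediction or the target."""
    ious = []
    for cls in range(num_classes):
        # Pixels labeled cls in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
        # Pixels labeled cls in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
        if union:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0

# Toy example with six pixels and three classes.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)  # (1/3 + 2/3 + 1/2) / 3 = 0.5
```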


“…ICNet [28] achieves a great balance between efficiency and accuracy by using a hierarchical structure to save time on high-resolution feature maps. As regards RGB-D semantic segmentation, some studies have tried to utilize the depth information to achieve better segmentation accuracy [9,13,18,25]. Hazirbas [9] presented a fusion-based CNN architecture consisting of two encoder branches for the RGB and depth channels.…”

Section: Semantic Segmentation (mentioning)

confidence: 99%

Small Obstacle Avoidance Based on RGB-D Semantic Segmentation

Hua, Yang, Lian 2019

2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)


This paper presents a novel obstacle avoidance system for road robots equipped with an RGB-D sensor that captures the scene ahead. The purpose of the system is to let road robots move around autonomously and continuously without collisions, even with small obstacles, which are often missed by existing solutions. For each input RGB-D image, the system uses a new two-stage semantic segmentation network followed by morphological processing to generate an accurate semantic map containing the road and obstacles. Based on this map, local path planning is applied to avoid possible collisions. Additionally, a training scheme with optical flow supervision and motion blur augmentation is applied to improve temporal consistency between adjacent frames and to overcome the disturbance caused by camera shake. Various experiments show that the proposed architecture achieves high performance in both indoor and outdoor scenarios.
