Efficient Uncertainty Estimation for Semantic Segmentation in Videos

Huang, Po‐Yu; Hsu, Wan Ting; Chiu, Chun-Yueh; Wu, Tingfan; Sun, Min

doi:10.1007/978-3-030-01246-5_32

Cited by 100 publications

(47 citation statements)

References 30 publications


“…In the inference phase, we use the training dataset and validation dataset to train our model with 960 × 720 resolution input. Our models are compared to some non-real-time algorithms, including SegNet (Badrinarayanan et al, 2017), Deeplab (Chen et al, 2015), RTA (Huang et al, 2018), Dilate8 (Yu and Koltun, 2016), PSPNet (Zhao et al, 2017), VideoGCRF (Chandra et al, 2018), and DenseDecoder (Bilinski and Prisacariu, 2018), and real-time algorithms, containing ENet (Paszke et al, 2016), IC-Net (Zhao et al, 2018a), DABNet (Li et al, 2019a), DFANet (Li et al, 2019b), SwiftNet (Orsic et al, 2019), BiSeNetV1 (Yu et al, 2018a). BiSeNetV2 achieves much faster inference speed than other methods.…”

Section: Performance Evaluation (mentioning)

confidence: 99%

BiSeNet: Bilateral Segmentation Network for Real-Time Semantic Segmentation

Wang, Peng et al. 2018

Lecture Notes in Computer Science


The low-level details and high-level semantics are both essential to the semantic segmentation task. However, to speed up the model inference, current approaches almost always sacrifice the low-level details, which leads to a considerable accuracy decrease. We propose to treat these spatial details and categorical semantics separately to achieve high accuracy and high efficiency for real-time semantic segmentation. To this end, we propose an efficient and effective architecture with a good trade-off between speed and accuracy, termed Bilateral Segmentation Network (BiSeNet V2). This architecture involves: (i) a Detail Branch, with wide channels and shallow layers to capture low-level details and generate high-resolution feature representation; (ii) a Semantic Branch, with narrow channels and deep layers to obtain high-level semantic context. The Semantic Branch is lightweight due to reducing the channel capacity and a fast-downsampling strategy. Furthermore, we design a Guided Aggregation Layer to enhance mutual connections and fuse both types of feature representation. Besides, a booster training strategy is designed to improve the segmentation performance without any extra inference cost. Extensive quantitative and qualitative evaluations demonstrate that the proposed architecture performs favourably against a few state-of-the-art real-time semantic segmentation approaches. Specifically, for a 2,048×1,024 input, we achieve 72.6% Mean IoU on the Cityscapes test set with a speed of 156 FPS on one NVIDIA GeForce GTX 1080 Ti card, which is significantly faster than existing methods, yet we achieve better segmentation accuracy. Code and trained models will be made publicly available.


“…To this end, most methods employ the popular sampling-based Monte Carlo (MC) dropout technique [10] and Bayesian neural networks (BNNs). For example, methods such as [18], [19], and [20], [21] employ modified versions of MC dropout to predict per pixel semantic and bounding box regression uncertainties, respectively.…”

Section: B Uncertainty Estimation (mentioning)

confidence: 99%

Robust Monocular Localization in Sparse HD Maps Leveraging Multi-Task Uncertainty Estimation

Kürsat, Sirohi, Büscher et al. 2021

Preprint


Robust localization in dense urban scenarios using a low-cost sensor setup and sparse HD maps is highly relevant for the current advances in autonomous driving, but remains a challenging topic in research. We present a novel monocular localization approach based on a sliding-window pose graph that leverages predicted uncertainties for increased precision and robustness against challenging scenarios and per-frame failures. To this end, we propose an efficient multi-task uncertainty-aware perception module, which covers semantic segmentation, as well as bounding box detection, to enable the localization of vehicles in sparse maps, containing only lane borders and traffic lights. Further, we design differentiable cost maps that are directly generated from the estimated uncertainties. This opens up the possibility to minimize the reprojection loss of amorphous map elements in an association-free and uncertainty-aware manner. Extensive evaluation on the Lyft 5 dataset shows that, despite the sparsity of the map, our approach enables robust and accurate 6D localization in challenging urban scenarios using only monocular camera images and vehicle odometry.


“…There are some approaches to uncertainty modelling for deep learning proposed, but most of them need to sample several times, which is destructive to bi-temporal applications (Huang et al, 2018). In this paper, we focus on Monte Carlo dropout (Gal and Ghahramani, 2016) for uncertainty modelling in building change detection.…”

Section: Uncertainty Modelling (mentioning)

confidence: 99%

Deep Few-Shot Learning for Bi-Temporal Building Change Detection

Khoshboresh-Masouleh, Shah-Hosseini 2021

Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci.


Abstract. In real-world applications (e.g., change detection), annotating images is very expensive. To build effective deep learning models in these applications, deep few-shot learning methods have been developed and prove to be a robust approach in small training data. The study of building change detection from high spatial resolution satellite observations is important to research in remote sensing, photogrammetry, and computer vision nowadays, which can be widely used in a variety of real-world applications, such as map generation and updating. As manual high-resolution image interpretation is expensive and time-consuming, building change detection methods are of high interest. The interest in developing building change detection approaches from optical remote sensing images is rapidly increasing due to larger coverages, and lower costs of optical images. In this study, we focus on building change detection analysis on a small set of building changes from different regions that sit in several cities. In this paper, a new deep few-shot learning method is proposed for building change detection using Monte Carlo dropout and remote sensing observations. The setup is based on a small dataset, including bitemporal optical images labelled for building change detection.
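The citation statements above refer to Monte Carlo (MC) dropout and to the cost of having to sample several times. As a rough illustration only, and not the method of any paper listed here, the following sketch keeps dropout active at inference, runs several stochastic forward passes, averages the softmax outputs, and reports the predictive entropy as an uncertainty score. The toy two-layer network, its random weights, and the names `forward_with_dropout` and `mc_dropout_predict` are all hypothetical, chosen only for this example.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward_with_dropout(x, w_hidden, w_out, drop_prob, rng):
    """One stochastic forward pass; dropout stays active at inference."""
    hidden = []
    for row in w_hidden:
        act = max(0.0, sum(w * xi for w, xi in zip(row, x)))  # ReLU unit
        # Inverted dropout: zero the unit with probability drop_prob,
        # otherwise rescale it so the expected activation is unchanged.
        keep = rng.random() >= drop_prob
        hidden.append(act / (1.0 - drop_prob) if keep else 0.0)
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    return softmax(logits)


def mc_dropout_predict(x, w_hidden, w_out, num_samples=20, drop_prob=0.5, seed=0):
    """Average num_samples stochastic passes and return (mean_probs, entropy)."""
    rng = random.Random(seed)
    mean_probs = [0.0] * len(w_out)
    for _ in range(num_samples):
        probs = forward_with_dropout(x, w_hidden, w_out, drop_prob, rng)
        mean_probs = [m + p / num_samples for m, p in zip(mean_probs, probs)]
    # Predictive entropy of the averaged distribution (higher = less certain).
    entropy = -sum(p * math.log(p) for p in mean_probs if p > 0.0)
    return mean_probs, entropy


# Toy "pixel" with 4 input features, 8 hidden units, 3 classes (all assumed).
init = random.Random(42)
w_hidden = [[init.gauss(0, 1) for _ in range(4)] for _ in range(8)]
w_out = [[init.gauss(0, 1) for _ in range(8)] for _ in range(3)]
pixel_features = [0.5, -1.2, 0.3, 0.8]

mean_probs, entropy = mc_dropout_predict(pixel_features, w_hidden, w_out)
print(mean_probs, entropy)
```

Each prediction here costs `num_samples` forward passes, which is the repeated-sampling overhead the quoted statements describe.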
