Pixel-Level Encoding and Depth Layering for Instance-Level Semantic Labeling

Uhrig, Jonas; Cordts, Marius; Franke, Uwe; Brox, Thomas

doi:10.1007/978-3-319-45886-1_2

Cited by 161 publications

(167 citation statements)

References 37 publications

(102 reference statements)

Supporting

Mentioning

165

Contrasting

“…Here, cues from an external object detector are fused with a semantic segmentation output using a Conditional Random Field (CRF) in order to segment the semantic segmentation output into instances. In earlier work, methods like InstanceCut [17] and the work by Uhrig et al. [18] solved the same task with single unified networks, also relying on postprocessing steps to split semantic segmentation predictions into instances. However, they are outperformed by DIN.…”

Section: Jsis-net (mentioning)

confidence: 99%

Fast Panoptic Segmentation Network

Geus, Meletis, Dubbelman (2020). IEEE Robot. Autom. Lett.

In this work, we present an end-to-end network for fast panoptic segmentation. This network, called Fast Panoptic Segmentation Network (FPSNet), does not require computationally costly instance mask predictions or merging heuristics. This is achieved by casting the panoptic task into a custom dense pixel-wise classification task, which assigns a class label or an instance id to each pixel. We evaluate FPSNet on the Cityscapes and Pascal VOC datasets, and find that FPSNet is faster than existing panoptic segmentation methods, while achieving better or similar panoptic segmentation performance. On the Cityscapes validation set, we achieve a Panoptic Quality score of 55.1%, at prediction times of 114 milliseconds for images with a resolution of 1024x2048 pixels. For lower resolutions of the Cityscapes dataset and for the Pascal VOC dataset, FPSNet runs at 22 and 35 frames per second, respectively.
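The abstract above reports a Panoptic Quality (PQ) score without defining it. As a rough illustration, the sketch below computes PQ for a single class using the commonly used definition (sum of IoUs over matched segments, divided by TP + FP/2 + FN/2). The function name and its inputs are hypothetical, and this is not FPSNet's evaluation code; benchmark scores are normally produced by the official dataset evaluation scripts.

```python
def panoptic_quality(matched_ious, num_false_positives, num_false_negatives):
    """Per-class PQ: sum of matched IoUs / (TP + 0.5 * FP + 0.5 * FN).

    matched_ious: IoU of each matched (prediction, ground truth) segment pair.
    Assumption: matching was done beforehand, so every IoU exceeds 0.5.
    """
    for iou in matched_ious:
        if not 0.5 < iou <= 1.0:
            raise ValueError("a matched pair must have IoU in (0.5, 1.0]")
    true_positives = len(matched_ious)
    denominator = (true_positives
                   + 0.5 * num_false_positives
                   + 0.5 * num_false_negatives)
    if denominator == 0:
        # No segments of this class in prediction or ground truth.
        return 0.0
    return sum(matched_ious) / denominator


# Two matched segments, one unmatched prediction, one missed ground-truth segment.
pq = panoptic_quality([0.8, 0.6], num_false_positives=1, num_false_negatives=1)
print(f"PQ = {pq:.3f}")
```

A dataset-level PQ is then usually the average of the per-class values.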

“…There has also been work on joint learning of semantic segmentation and object detection. In [16], the authors describe an approach to instance segmentation using multi-task learning. For each pixel they predict the class label, depth and the direction to the next instance center using a single neural network.…”

Section: Related Work (mentioning)

confidence: 99%

Simultaneous Object Detection and Semantic Segmentation

Salscheider (2020). Proceedings of the 9th International Conference on Pattern Recognition Applications and Methods

Both object detection in and semantic segmentation of camera images are important tasks for automated vehicles. Object detection is necessary so that the planning and behavior modules can reason about other road users. Semantic segmentation provides for example free space information and information about static and dynamic parts of the environment. There has been a lot of research to solve both tasks using Convolutional Neural Networks. These approaches give good results but are computationally demanding. In practice, a compromise has to be found between detection performance, detection quality and the number of tasks. Otherwise it is not possible to meet the real-time requirements of automated vehicles. In this work, we propose a neural network architecture to solve both tasks simultaneously. This architecture was designed to run with around 10 Hz on 1 MP images on current hardware. Our approach achieves a mean IoU of 61.2% for the semantic segmentation task on the challenging Cityscapes benchmark. It also achieves an average precision of 69.3% for cars and 67.7% on the moderate difficulty level of the KITTI benchmark.
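The citation statement for this entry describes predicting, for each pixel, the direction to the instance center. A minimal sketch of how such a direction target could be derived from an instance-id map is given below. It assumes a background id of 0, uses the pixel centroid as the "center", and stores continuous unit vectors. These are illustrative choices and the function name is hypothetical; the cited work's actual encoding may differ.

```python
import math


def direction_to_center(instance_ids, background_id=0):
    """Return, per pixel, a unit vector (dy, dx) pointing at its instance centroid.

    instance_ids: 2D list of integer instance ids (rows of equal length).
    Background pixels and pixels lying exactly on the centroid get (0.0, 0.0).
    """
    # Group pixel coordinates by instance id.
    pixels = {}
    for y, row in enumerate(instance_ids):
        for x, inst in enumerate(row):
            if inst != background_id:
                pixels.setdefault(inst, []).append((y, x))

    directions = [[(0.0, 0.0) for _ in row] for row in instance_ids]
    for coords in pixels.values():
        # Centroid of the instance mask (an assumption about what "center" means).
        cy = sum(y for y, _ in coords) / len(coords)
        cx = sum(x for _, x in coords) / len(coords)
        for y, x in coords:
            dy, dx = cy - y, cx - x
            norm = math.hypot(dy, dx)
            if norm > 0:
                directions[y][x] = (dy / norm, dx / norm)
    return directions


# One three-pixel instance (id 1) on the top row, a single-pixel instance (id 2) below.
ids = [[1, 1, 1],
       [0, 2, 0]]
for row in direction_to_center(ids):
    print(row)
```

Pixels on opposite sides of an instance point toward each other, which is the cue a post-processing step can use to separate neighbouring instances of the same class.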

“…[38] built a CNN-based architecture to jointly reason pixel-wise instance level segmentation as well as depth order from multi-scale image patches and they combined predictions into the final labeling via the MRF. [35] used a fully convolutional network (FCN) to jointly predict pixel-level semantic labels, depths and the directions to object centers from a single street scene image.…”

Section: Depth Ordering (mentioning)

confidence: 99%

“…Indeed, every single objective of building an amodal perception system is not new and has long been studied separately or jointly in the community. There is plenty of literature on visual understanding despite the presence of occlusion [11,30,34,36], depth ordering [26,35,38] and object completion and inpainting [7,19], that together show the feasibility and practicability of building machine vision systems with such capabilities. In this work, we try to solve all of these problems in a single amodal segmentation framework.…”

Section: Introduction (mentioning)

confidence: 99%

Learning Semantics-aware Distance Map with Semantics Layering Network for Amodal Instance Segmentation

Zhang, Chen, Xie, et al. (2019). Proceedings of the 27th ACM International Conference on Multimedia

In this work, we demonstrate yet another approach to tackle the amodal segmentation problem. Specifically, we first introduce a new representation, namely a semantics-aware distance map (sem-dist map), to serve as our target for amodal segmentation instead of the commonly used masks and heatmaps. The sem-dist map is a kind of level-set representation, of which the different regions of an object are placed into different levels on the map according to their visibility. It is a natural extension of masks and heatmaps, where modal, amodal segmentation, as well as depth order information, are all well-described. Then we also introduce a novel convolutional neural network (CNN) architecture, which we refer to as semantic layering network, to estimate sem-dist maps layer by layer, from the global-level to the instance-level, for all objects in an image. Extensive experiments on the COCOA and D2SA datasets have demonstrated that our framework can predict amodal segmentation, occlusion and depth order with state-of-the-art performance.
