2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019
DOI: 10.1109/cvpr.2019.00906
|View full text |Cite
|
Sign up to set email alerts
|

Improving Semantic Segmentation via Video Propagation and Label Relaxation

Abstract: Semantic segmentation requires large amounts of pixelwise annotations to learn accurate models. In this paper, we present a video prediction-based methodology to scale up training sets by synthesizing new training samples in order to improve the accuracy of semantic segmentation networks. We exploit video prediction models' ability to predict future frames in order to also predict future labels. A joint propagation strategy is also proposed to alleviate mis-alignments in synthesized samples. We demonstrate tha… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
2
2
1

Citation Types

2
296
2

Year Published

2020
2020
2022
2022

Publication Types

Select...
7
3

Relationship

0
10

Authors

Journals

citations
Cited by 362 publications
(300 citation statements)
references
References 42 publications
(62 reference statements)
2
296
2
Order By: Relevance
“…Xu et al [16] applied different segmentation strategies to various regions of the input image, which exploited optical flow to preserve the semantics in static regions. Zhu et al [17] investigated the generation of future semantic segmentation labels from current manual labels by video prediction based on motion vector estimation.…”
Section: B Semantics Sharingmentioning
confidence: 99%
“…Xu et al [16] applied different segmentation strategies to various regions of the input image, which exploited optical flow to preserve the semantics in static regions. Zhu et al [17] investigated the generation of future semantic segmentation labels from current manual labels by video prediction based on motion vector estimation.…”
Section: B Semantics Sharingmentioning
confidence: 99%
“…A second approach to handle the problem of implausible prediction outputs that lack realism is to reduce the complexity of the problem. Many authors, for example, used data with lower-dimensional image content, such as label images, instead of natural image scenes [12,15,[22][23][24][25]. Others split the problem into two problems, motion and content prediction, and learn separate representations for the static and dynamic components.…”
Section: Related Workmentioning
confidence: 99%
“…With an evaluation time of one second, this algorithm is too slow for robotic applications or automated driving. (Zhu et al, 2019) show a video-based approach to further improve the segmentation process by propagating labels between two frames jointly. In contrast to that, (Chen et al, 2018) show a network capable of close to real-time semantic segmentation.…”
Section: Related Workmentioning
confidence: 99%