2022
DOI: 10.1007/s11263-022-01629-1
Occluded Video Instance Segmentation: A Benchmark

Abstract: Can our video understanding systems perceive objects when a heavy occlusion exists in a scene? To answer this question, we collect a large-scale dataset called OVIS for occluded video instance segmentation, that is, to simultaneously detect, segment, and track instances in occluded scenes. OVIS consists of 296k high-quality instance masks from 25 semantic categories, where object occlusions usually occur. While our human vision systems can understand those occluded instances by contextual reasoning and associa…
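As a concrete illustration of working with the dataset, the sketch below reads an OVIS-style annotation file and tallies instance tracks per category. It assumes OVIS ships YouTube-VIS-style JSON annotations (top-level "videos", "categories", and "annotations" keys); the function name and file path are illustrative, not part of the official toolkit.

```python
import json

def summarize_annotations(path):
    """Count instance tracks per category in a YouTube-VIS-style
    annotation file (an assumption about the OVIS layout)."""
    with open(path) as f:
        data = json.load(f)
    # Map category ids to human-readable names.
    cat_names = {c["id"]: c["name"] for c in data["categories"]}
    counts = {}
    for ann in data["annotations"]:
        name = cat_names[ann["category_id"]]
        counts[name] = counts.get(name, 0) + 1
    return counts
```

A quick sanity check on a toy annotation file would confirm the per-category totals before running any heavier statistics.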

Cited by 65 publications (54 citation statements)
References 91 publications
“…On the other hand, video data in the real world carry additional temporal information compared to static image data, and the data at test time are videos, so it is natural to bring in more video segmentation datasets to improve performance. Benefiting from the recent release of several new video segmentation datasets, namely YouTube-VIS (more objects per video), OVIS [18] (significant occlusion), and VSPW [12] (dense annotations and high-resolution frames), we introduce them into the second training stage, significantly improving model performance.…”
Section: Data Matters
confidence: 99%
“…In the pre-training stage, several static image datasets, including COCO [9], ECSSD [19], MSRA10K [4], PASCAL-S [7], and PASCAL-VOC [6], are used for preliminary semantic learning. During the main training stage, video datasets including YouTube-VOS [23], DAVIS 2017 [17], YouTube-VIS [8], OVIS [18], and VSPW [12] are used to enhance the generalization and robustness of the model.…”
Section: Training Details
confidence: 99%
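The two-stage recipe quoted above (static-image pre-training followed by video main training) can be sketched as a simple training plan. The dataset names mirror those in the quote; the scheduler itself is a hypothetical illustration, not the cited authors' actual pipeline.

```python
# Hypothetical two-stage schedule mirroring the quoted recipe:
# static-image datasets first, then video datasets.
STAGES = {
    "pretrain": ["COCO", "ECSSD", "MSRA10K", "PASCAL-S", "PASCAL-VOC"],
    "main": ["YouTube-VOS", "DAVIS 2017", "YouTube-VIS", "OVIS", "VSPW"],
}

def training_plan(stages):
    """Yield (stage, dataset) pairs in training order."""
    for stage in ("pretrain", "main"):
        for name in stages[stage]:
            yield stage, name
```

Iterating the plan gives every pre-training dataset before any video dataset, matching the stage ordering the citing work describes.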
“…Amodal perception is of interest in many application fields of computer vision. Hence, datasets for amodal perception can be found in different fields, e.g., instance [17] and video instance segmentation [19], [20], human recognition and deocclusion [21]. The OVIS dataset [19] provides instance masks for videos while additionally labeling the occlusion level of each instance.…”
Section: A. Datasets for Amodal Perception
confidence: 99%
“…Hence, datasets for amodal perception can be found in different fields, e.g., instance [17] and video instance segmentation [19], [20], human recognition and deocclusion [21]. The OVIS dataset [19] provides instance masks for videos while additionally labeling the occlusion level of each instance. SAIL-VOS [20] is a synthetic video instance segmentation dataset with amodal instance segmentation masks.…”
Section: A. Datasets for Amodal Perception
confidence: 99%