2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.00408

RANet: Ranking Attention Network for Fast Video Object Segmentation

Abstract: Although online learning (OL) techniques have boosted the performance of semi-supervised video object segmentation (VOS) methods, the huge time costs of OL greatly restrict their practicality. Matching-based and propagation-based methods run faster by avoiding OL techniques. However, they are limited by sub-optimal accuracy due to mismatching and drifting problems. In this paper, we develop a real-time yet very accurate Ranking Attention Network (RANet) for VOS. Specifically, to integrate the insigh…

Cited by 210 publications (101 citation statements)
References 54 publications
“…It has received widespread attention recently and been widely used in many vision and image processing related areas, such as content-aware image editing [6], object recognition [42], photosynth [4], non-photo-realist rendering [41], weakly supervised semantic segmentation [19] and image retrieval [15]. Besides, there are many works focusing on video salient object detection [12,54] and RGB-D salient object detection [11,66].…”
Section: Introduction (mentioning)
confidence: 99%
“…However, this strategy easily leads to overfitting to the initial target appearance and impractically long run-times. More recent methods [34,13,32,23,36,24,17] therefore integrate target-specific appearance models into the segmentation architecture. In addition to improved run-times, many of these methods can also benefit from full end-to-end learning, which has been shown to have a crucial impact on performance [32,14,24].…”
Section: Related Work (mentioning)
confidence: 99%
“…This information is then sent to the segmentation network to predict the target mask. The method [36] predicts template correlation filters given the input target mask. Target classification is then performed by applying the correlation filters on the test frame.…”
Section: Related Work (mentioning)
confidence: 99%
“…Video Object Segmentation (VOS) (Perazzi et al 2016) is related to VSOD, and it mainly includes Unsupervised VOS (UVOS) (Song et al 2018) and semi-supervised VOS (Wang et al 2019b). Semi-supervised VOS aims to segment specific objects which are assigned by the first frame, while UVOS predicts masks for primary objects in a video, with no other hints such as reference masks.…”
Section: Video Object Segmentation (mentioning)
confidence: 99%