2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.00932

Video Object Segmentation Using Space-Time Memory Networks

Abstract: We propose a novel solution for semi-supervised video object segmentation. By the nature of the problem, available cues (e.g. video frame(s) with object masks) become richer with the intermediate predictions. However, the existing methods are unable to fully exploit this rich source of information. We resolve the issue by leveraging memory networks and learn to read relevant information from all available sources. In our framework, the past frames with object masks form an external memory, and the current fram…
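
The abstract is truncated above, so the exact read mechanism is not shown here. As a rough illustration of what "reading from an external memory" can look like, the following is a generic soft key-value memory read in NumPy. It is a sketch under stated assumptions, not the paper's implementation: the function name `memory_read`, the dot-product similarity, the softmax weighting, and all shapes are assumptions chosen for illustration.

```python
import numpy as np

def memory_read(query_keys, memory_keys, memory_values):
    """Generic soft key-value memory read (illustrative assumption only).

    query_keys:    (Nq, Ck) keys for locations in the current frame
    memory_keys:   (Nm, Ck) keys for locations in all stored past frames
    memory_values: (Nm, Cv) values stored alongside the memory keys
    Returns an (Nq, Cv) array: one weighted mix of memory values per query.
    """
    # Dot-product similarity between every query and every memory location.
    scores = query_keys @ memory_keys.T                # (Nq, Nm)
    # Softmax over memory locations, shifted for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # Each query retrieves a convex combination of the stored values.
    return weights @ memory_values                     # (Nq, Cv)

# Hypothetical sizes: 2 past frames of 5 locations each, 6 query locations.
rng = np.random.default_rng(0)
q = rng.normal(size=(6, 4))
k = rng.normal(size=(2 * 5, 4))
v = rng.normal(size=(2 * 5, 3))
print(memory_read(q, k, v).shape)
```

In this sketch, adding another past frame only appends rows to `memory_keys` and `memory_values`, which is one way a memory can grow as intermediate predictions become available.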

Citations: cited by 531 publications (453 citation statements)
References: 38 publications (136 reference statements)

“…However, this strategy easily leads to overfitting to the initial target appearance and impractically long run-times. More recent methods [34,13,32,23,36,24,17] therefore integrate target-specific appearance models into the segmentation architecture. In addition to improved run-times, many of these methods can also benefit from full end-to-end learning, which has been shown to have a crucial impact on performance [32,14,24].…”
Section: Related Work (mentioning)
confidence: 99%
“…While most state-of-the-art VOS approaches employ similar image feature extractors and segmentation heads, the advances in how to capture and utilize target information has led to much improved performance [14,32,24,28]. A Fig.…”
Section: Introduction (mentioning)
confidence: 99%
“…Inspired by recent advances in related video tasks, e.g., dense connections for spatio-temporal interaction in action recognition [39] and space-time memory block in video object segmentation [40], we possibly consider these techniques to avoid the latent dependency issues in video SOD. Learning spatial-temporal features in an end-to-end manner is important for further accuracy improvement.…”
Section: Promising Future Work (mentioning)
confidence: 99%