2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019
DOI: 10.1109/iccv.2019.00933

Zero-Shot Video Object Segmentation via Attentive Graph Neural Networks

Abstract: This work proposes a novel attentive graph neural network (AGNN) for zero-shot video object segmentation (ZVOS). The suggested AGNN recasts this task as a process of iterative information fusion over video graphs. Specifically, AGNN builds a fully connected graph to efficiently represent frames as nodes, and relations between arbitrary frame pairs as edges. The underlying pair-wise relations are described by a differentiable attention mechanism. Through parametric message passing, AGNN is able to efficiently c…
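The attention-driven message passing the abstract describes can be sketched as follows. This is an illustrative NumPy toy, not the authors' implementation: the feature dimensions, dot-product attention edges, and the averaging update rule are all assumptions made for the sketch.

```python
import numpy as np

def attention_message_passing(nodes, n_iters=3):
    """nodes: (N, D) per-frame features; returns iteratively fused features.

    Each frame is a node in a fully connected graph; edge weights come
    from a differentiable (softmax) attention over pairwise similarities.
    """
    n, d = nodes.shape
    for _ in range(n_iters):
        logits = nodes @ nodes.T / np.sqrt(d)       # pairwise attention logits
        attn = np.exp(logits - logits.max(axis=1, keepdims=True))
        attn /= attn.sum(axis=1, keepdims=True)     # row-stochastic edge weights
        messages = attn @ nodes                     # aggregate info from all frames
        nodes = 0.5 * (nodes + messages)            # simple residual-style update
    return nodes

frames = np.random.default_rng(0).standard_normal((4, 16))
fused = attention_message_passing(frames)
```

In the paper the node features are convolutional frame embeddings and the update is learned; the sketch only shows the graph-plus-attention fusion pattern.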

Cited by 243 publications (131 citation statements)
References 64 publications (136 reference statements)
“…Now, we are working on an adaptation of the proposed segmentation method into 3D [18,27]. We are also considering assessing the method on biomedical images and more experimental comparisons with very popular, fully convolutional approaches [29,54]. Finally, we plan to tackle a more complex task of semantic segmentation.…”
Section: Discussion
confidence: 99%
“…With their popularity in the field of natural language processing [8,39,43,49,60], attention modeling is rapidly adopted in various computer vision tasks, such as image recognition [14,23,58,66,73], domain adaptation [67,83], human pose estimation [9,63,77], object detection [4] and image generation [76,81,86]. Further, co-attention mechanisms become an essential tool in many vision-language applications and sequential modeling tasks, such as visual question answering [41,44,75,78], visual dialog [74,84], vision-language navigation [68], and video segmentation [42,61], showing its effectiveness in capturing the underlying relations between different entities. Inspired by the general idea of attention mechanisms, this work leverages co-attention to mine semantic relations within training image pairs, which helps the classifier network learn complete object patterns and generate precise object localization maps.…”
Section: Related Work
confidence: 99%
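The co-attention idea in the statement above, mining relations between the feature maps of an image pair, can be sketched minimally. This is a hypothetical NumPy illustration under assumed shapes (flattened feature maps of `Na` and `Nb` positions, shared dimension `D`), not the cited method's code:

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(fa, fb):
    """fa: (Na, D), fb: (Nb, D) flattened feature maps of an image pair.

    Returns each map re-expressed as an attention-weighted summary of
    the other, exposing shared (co-occurring) object patterns.
    """
    d = fa.shape[1]
    affinity = fa @ fb.T / np.sqrt(d)           # (Na, Nb) pairwise similarity
    a_from_b = softmax(affinity, axis=1) @ fb   # each position in A summarizes B
    b_from_a = softmax(affinity.T, axis=1) @ fa # and vice versa
    return a_from_b, b_from_a

rng = np.random.default_rng(1)
a_out, b_out = co_attention(rng.standard_normal((6, 8)),
                            rng.standard_normal((5, 8)))
```

Regions present in both images reinforce each other through the affinity matrix, which is what lets co-attention localize complete object patterns across a pair.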
“…Pixels that are similar in properties combine together. The selection standard is referred to as the threshold value (Kaur & Kaur, 2014; Nishanth & Karthik, 2015; Lu, Ma, Ni, & Yang, 2019; Wang, Lu, Shen, Crandall, & Shao, 2019). Thresholding is a simple method of segmentation.…”
Section: Automatic Detection Of Exudates From Retinal Images
confidence: 99%
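The thresholding segmentation described above reduces to a single comparison per pixel. A minimal sketch (the threshold value `100` and the toy image are illustrative assumptions):

```python
import numpy as np

def threshold_segment(image, thresh):
    """Binary segmentation: pixels brighter than the threshold become foreground (1)."""
    return (image > thresh).astype(np.uint8)

img = np.array([[10, 200],
                [120, 30]], dtype=np.uint8)
mask = threshold_segment(img, 100)  # -> [[0, 1], [1, 0]]
```

In practice the threshold is usually chosen from the image histogram (e.g. Otsu's method) rather than fixed by hand.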