2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.00374
See More, Know More: Unsupervised Video Object Segmentation With Co-Attention Siamese Networks

Abstract: We introduce a novel network, called CO-attention Siamese Network (COSNet), to address the unsupervised video object segmentation task from a holistic view. We emphasize the importance of inherent correlation among video frames and incorporate a global co-attention mechanism to improve further the state-of-the-art deep learning based solutions that primarily focus on learning discriminative foreground representations over appearance and motion in short-term temporal segments. The co-attention layers in our net…

Cited by 456 publications (244 citation statements)
References 62 publications
“…The so-called attention mechanism is a way of observing the world that imitates human perception [35]. Recently, it has been shown to be a simple but effective tool for improving the representational ability of CNNs through reweighting of the feature maps; this is achieved using spatial attention and channel attention to scale features according to whether they are meaningful or uninformative [36][37][38][39][40][41][42][43].…”
Section: Attention Mechanism
confidence: 99%
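The channel-attention reweighting that this citation statement describes can be sketched in a few lines. The following is a minimal squeeze-and-excitation-style example, not the implementation from any of the cited papers; the weight shapes and the reduction ratio are illustrative assumptions.

```python
import numpy as np

def channel_attention(feat, w1, w2):
    """Reweight each channel of a (C, H, W) feature map.

    w1: (C // r, C) and w2: (C, C // r) form a small bottleneck MLP
    (r is an assumed reduction ratio), as in squeeze-and-excitation.
    """
    squeeze = feat.mean(axis=(1, 2))               # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)         # ReLU bottleneck
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gates in (0, 1)
    return feat * scale[:, None, None]             # scale each channel

# Toy usage: 4 channels, 2x2 spatial grid, reduction ratio r = 2.
feat = np.ones((4, 2, 2))
out = channel_attention(feat, np.ones((2, 4)), np.ones((4, 2)))
```

Meaningful channels receive gates near 1 and pass through almost unchanged, while uninformative channels are suppressed toward 0; spatial attention works analogously with a per-location gate instead of a per-channel one.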
“…More recently, in order to understand fine-grained relationships and mine the underlying correlations between different modalities, co-attention mechanisms have been widely studied in vision-and-language tasks such as visual question answering (VQA) [46][47][48][49]. In computer vision, Lu et al., inspired by the above-mentioned works, built a co-attention module that captures the coherence between video frames and effectively surpasses current alternatives [40].…”
Section: Attention Mechanism
confidence: 99%
“…In the area of segmentation, semantic segmentation and panoptic segmentation [43][44][45][46] use the attention mechanism to guide the feed-forward network toward more accurate segmentation. In particular, the attention mechanism in video object segmentation helps the model focus on target objects and overlook confusing background [41,47,48]. The attention mechanism itself comes in several variants: hierarchical attention [49], self-attention [50], and co-attention [51].…”
Section: Attention Mechanism
confidence: 99%
“…tasks [20], [21], [22], [23], including visual tracking [24], [25], have been shown to benefit from powerful deep discriminative features [26], [27], [28], [29]. The top-ranked trackers in recent competitions, e.g., OTB100 [8], VOT2017 [30], and VOT2018 [31], are all based on deep neural network features.…”
Section: Introduction
confidence: 99%