2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019
DOI: 10.1109/cvpr.2019.00142

Fast Online Object Tracking and Segmentation: A Unifying Approach

Abstract: In this paper we illustrate how to perform both visual object tracking and semi-supervised video object segmentation, in real-time, with a single simple approach. Our method, dubbed SiamMask, improves the offline training procedure of popular fully-convolutional Siamese approaches for object tracking by augmenting their loss with a binary segmentation task. Once trained, SiamMask solely relies on a single bounding box initialisation and operates online, producing class-agnostic object segmentation masks and ro…

Cited by 1,280 publications (929 citation statements)
References 65 publications
“…When aiming at very high segmentation accuracy, methods generally perform online fine-tuning on the basis of this supervision [3,25,35,40,43,50,62], sometimes exploiting data-augmentation techniques [3,25] or self-supervision [62]. As online fine-tuning can take up to several minutes per video, many recently proposed methods renounce to it and instead aim at a faster online speed (e.g., [7,8,64]). These faster semi-supervised approaches come in many flavours.…”
Section: Related Work (mentioning)
confidence: 99%
“…The similarity-weighted combination of feature is used to predict the final mask. A fully convolutional Siamese network based approach (SiamMask) is proposed in [40]. It computes the depth-wise cross correlation between features of templates in the reference and the current frames.…”
Section: Related Work (mentioning)
confidence: 99%
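
The statement above mentions depth-wise cross-correlation between template and current-frame features. As a rough illustration only, and not the SiamMask authors' implementation, the operation can be sketched in plain Python: each channel of the search features is correlated with the matching channel of the template features, giving one response map per channel. The function name `depthwise_xcorr` and the nested-list feature layout (channels, rows, columns) are assumptions made for this sketch.

```python
def depthwise_xcorr(search, template):
    """Correlate each channel of `search` with the same channel of `template`.

    search:   list of C channels, each an Hs x Ws list of lists
    template: list of C channels, each an Ht x Wt list of lists
              (assumed Ht <= Hs and Wt <= Ws)
    returns:  list of C response maps, each (Hs-Ht+1) x (Ws-Wt+1)
    """
    if len(search) != len(template):
        raise ValueError("search and template need the same channel count")
    responses = []
    for s_ch, t_ch in zip(search, template):
        hs, ws = len(s_ch), len(s_ch[0])
        ht, wt = len(t_ch), len(t_ch[0])
        out = []
        # Slide the template channel over the search channel (no padding).
        for i in range(hs - ht + 1):
            row = []
            for j in range(ws - wt + 1):
                acc = 0.0
                for di in range(ht):
                    for dj in range(wt):
                        acc += s_ch[i + di][j + dj] * t_ch[di][dj]
                row.append(acc)
            out.append(row)
        responses.append(out)
    return responses


# Toy data: 2 channels, 3x3 search features, 2x2 template features.
search = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
]
template = [
    [[1, 0], [0, 1]],
    [[1, 1], [1, 1]],
]
response = depthwise_xcorr(search, template)
```

Unlike an ordinary cross-correlation, the channels are never summed together, so the output keeps one response map per feature channel. In a deep-learning framework the same effect is usually obtained with a grouped convolution rather than explicit loops.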
“…We first train the class-agnostic binary mask proposal network on COCO. Following the strategy used in [41], we then finetune the proposal network on the combination of COCO and YouTube-VOS with learning rate 0.02, batch size 8 and number of training iteration 200,000.…”
Section: Mask Proposal Generation (mentioning)
confidence: 99%
“…Moreover, a self-attention mechanism was integrated to force the network to capture the non-local features. SiamMask [42] used Siamese networks for object tracking using augmentation loss to produce a binary segmentation mask. In addition, the binary segmentation mask locates the object of interest accurately.…”
Section: Siamese-based Trackers (mentioning)
confidence: 99%