2023
DOI: 10.48550/arxiv.2303.11003
Preprint
Tubelet-Contrastive Self-Supervision for Video-Efficient Generalization

Abstract: We propose a self-supervised method for learning motion-focused video representations. Existing approaches minimize distances between temporally augmented videos, which maintain high spatial similarity. We instead propose to learn similarities between videos with identical local motion dynamics but an otherwise different appearance. We do so by adding synthetic motion trajectories to videos, which we refer to as tubelets. By simulating different tubelet motions and applying transformations, such as scaling and …
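The core idea in the abstract — pasting a synthetic patch ("tubelet") along the same motion trajectory into two different videos, so the pair shares motion but not appearance — can be sketched in a few lines. This is a minimal illustration, not the authors' actual pipeline; the functions `tubelet_trajectory` and `paste_tubelet` and the linear-motion model are assumptions made here for clarity:

```python
import numpy as np

def tubelet_trajectory(num_frames, start, velocity):
    """Hypothetical linear (x, y) trajectory for a synthetic tubelet."""
    t = np.arange(num_frames)[:, None]
    return start + t * velocity  # shape (num_frames, 2)

def paste_tubelet(frames, traj, patch):
    """Overlay `patch` at each trajectory position onto a copy of `frames`."""
    out = frames.copy()
    ph, pw = patch.shape[:2]
    for frame, (x, y) in zip(out, traj.astype(int)):
        frame[y:y + ph, x:x + pw] = patch
    return out

rng = np.random.default_rng(0)
T, H, W = 8, 64, 64  # short clip of 64x64 RGB frames
traj = tubelet_trajectory(T, start=np.array([4, 4]), velocity=np.array([3, 2]))
patch = rng.random((8, 8, 3))

# Two clips with different appearance (random backgrounds here)
# but identical tubelet motion form a positive pair for contrastive learning.
clip_a = paste_tubelet(rng.random((T, H, W, 3)), traj, patch)
clip_b = paste_tubelet(rng.random((T, H, W, 3)), traj, patch)
```

In the paper's setting, the transformations mentioned (e.g. scaling) would additionally be applied to the tubelet between the two views, so that only the motion dynamics remain identical.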

Cited by 0 publications. References 72 publications (135 reference statements).