2021
DOI: 10.48550/arxiv.2102.02751
Preprint

Semi-Supervised Action Recognition with Temporal Contrastive Learning

Abstract: Learning to recognize actions from only a handful of labeled videos is a challenging problem due to the scarcity of tediously collected activity labels. We approach this problem by learning a two-pathway temporal contrastive model using unlabeled videos at two different speeds leveraging the fact that changing video speed does not change an action. Specifically, we propose to maximize the similarity between encoded representations of the same video at two different speeds as well as minimize the similarity bet…
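
The objective described in the abstract can be illustrated with a generic InfoNCE-style contrastive loss. This is a minimal sketch under stated assumptions, not the paper's formulation: the abstract is truncated here, so treating other videos in the batch as negatives is an assumption, and the function name `speed_contrastive_loss`, the `temperature` value, and the use of cosine similarity are all hypothetical choices made for illustration.

```python
import numpy as np

def speed_contrastive_loss(fast, slow, temperature=0.5):
    """Generic InfoNCE-style loss over a batch of paired embeddings.

    fast[i] and slow[i] are assumed to be embeddings of video i at two
    playback speeds (the positive pair). Pairing fast[i] with slow[j],
    j != i, as a negative is an assumption made for this sketch.
    """
    # L2-normalise so that dot products are cosine similarities.
    fast = fast / np.linalg.norm(fast, axis=1, keepdims=True)
    slow = slow / np.linalg.norm(slow, axis=1, keepdims=True)
    # (batch, batch) similarity matrix, scaled by the temperature.
    logits = fast @ slow.T / temperature
    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positive pairs sit on the diagonal; average their negative log-probability.
    return float(-np.mean(np.diag(log_prob)))

# Toy usage with random vectors standing in for encoder outputs.
rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))
loss = speed_contrastive_loss(emb, emb + 0.01 * rng.normal(size=emb.shape))
```

Minimising this quantity pulls the two speed views of the same video together and pushes the views of different videos apart, which matches the intuition stated in the abstract even if the exact loss in the paper differs.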

Cited by 2 publications (5 citation statements)
References 39 publications
“…In our CMPL framework, this is done by feeding the primary backbone F (•) and the auxiliary network A(•) clips from different temporal locations of the same video while still requiring them to supervise each other. Meanwhile, we also follow [22,40] in regarding different frame rates as a form of temporal augmentation. This is also illustrated in Fig.…”
Section: Methods (mentioning)
confidence: 99%
“…[13] introduces a new framework that leverages a 2D image classifier to assist action recognition. [22] proposes a temporal contrastive learning framework to model temporal aspects by comparing the same video at different speeds.…”
Section: Related Work (mentioning)
confidence: 99%
“…Recently, Jing et al [27] and Singh et al [47] propose to adapt the SSL framework to the video domain. They focus on algorithmic improvement for video SSL.…”
Section: Data Augmentation (mentioning)
confidence: 99%