2022
DOI: 10.48550/arxiv.2203.14957
Preprint

Frame-wise Action Representations for Long Videos via Sequence Contrastive Learning

Abstract: Prior works on action representation learning mainly focus on designing various architectures to extract the global representations for short video clips. In contrast, many practical applications such as video alignment have strong demand for learning dense representations for long videos. In this paper, we introduce a novel contrastive action representation learning (CARL) framework to learn frame-wise action representations, especially for long videos, in a self-supervised manner. Concretely, we introduce a s…

Citations: cited by 1 publication (1 citation statement)
References: 35 publications
“…However, these advancements in architecture-types have not addressed the issue of learning viewpoint-agnostic representation. Viewpoint-agnostic representation learning is drawing increasing attention in the vision community due to its wide range of downstream applications like 3D object detection [41], video alignment [6,16,17], action recognition [47,48], pose estimation [22,50], robot learning [24,26,43,45,49], and other tasks.…”
Section: Related Work (mentioning)
Confidence: 99%