2020 IEEE Winter Conference on Applications of Computer Vision (WACV)
DOI: 10.1109/wacv45572.2020.9093535
Action Segmentation with Mixed Temporal Domain Adaptation

Abstract: The main progress for action segmentation comes from densely-annotated data for fully-supervised learning. Since manual annotation for frame-level actions is time-consuming and challenging, we propose to exploit auxiliary unlabeled videos, which are much easier to obtain, by shaping this problem as a domain adaptation (DA) problem. Although various DA techniques have been proposed in recent years, most of them have been developed only for the spatial direction. Therefore, we propose Mixed Temporal Domain Adapta…

Cited by 22 publications (11 citation statements)
References 37 publications
“…One category of works focuses on the specific action recognition task that aims to classify a video clip into a particular category of human actions via temporal alignment [8], temporal attention [49,13], or self-supervised video representation learning [46,13]. Another category of works focuses on action segmentation that simultaneously segments a video in time and classifies each segmented video clip with an action class via temporal alignment [9] or self-supervised video representation learning [10].…”
Section: Domain Adaptive Video Classification
confidence: 99%
“…Domain Adaptation for Videos. Prior works for video domain adaptation (DA) have focused on classification [6,11,28,42], segmentation [7,8] and localisation [2]. They use adversarial training to align the marginal distributions [28], an auxiliary self-supervised task [8,11,42], or attending to relevant frames for alignment [6][7][8].…”
Section: Related Work
confidence: 99%
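The adversarial alignment mentioned in this excerpt trains a frame-level domain discriminator with a binary cross-entropy loss; a gradient reversal layer then flips the discriminator's gradient so the feature extractor learns domain-confusing features. The sketch below is a minimal NumPy illustration of that loss (the function name and framing are my own, not from the cited papers):

```python
import numpy as np

def domain_adversarial_loss(d_logits, domain_labels):
    """Binary cross-entropy of a frame-level domain discriminator.

    d_logits: (N,) raw discriminator scores.
    domain_labels: (N,) with 0 = source frame, 1 = target frame.
    The discriminator minimizes this loss; a gradient reversal layer
    negates its gradient w.r.t. the features, so the feature extractor
    effectively maximizes it, aligning the marginal distributions.
    """
    p = 1.0 / (1.0 + np.exp(-d_logits))  # sigmoid probability of "target"
    eps = 1e-12                          # numerical guard for log(0)
    return float(-np.mean(domain_labels * np.log(p + eps)
                          + (1.0 - domain_labels) * np.log(1.0 - p + eps)))
```

An uninformative discriminator (all logits zero) yields the maximum-confusion loss of log 2, which is exactly the operating point the feature extractor pushes toward.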
“…However, not all the frame-level features contribute equally to the overall domain discrepancy. Therefore, inspired by [4,5], we assign larger attention weights to the features which have larger domain discrepancy, so that we can focus more on aligning those features, achieving more effective domain adaptation.…”
Section: Technical Approach Details
confidence: 99%
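The attention scheme this excerpt describes can be sketched as follows: frames on which a domain discriminator is confident (low prediction entropy) are treated as having larger domain discrepancy and receive larger pooling weights. The residual `1 + (1 - H)` form below is one plausible choice for illustration, not necessarily the exact formula of [4,5]:

```python
import numpy as np

def domain_attentive_pooling(features, d_probs):
    """Weight frame features by domain discrepancy before temporal pooling.

    features: (T, D) frame-level features.
    d_probs:  (T,) discriminator probability that each frame is target-domain.
    A confident discriminator (low binary entropy) marks a poorly aligned
    frame, which therefore gets more attention during pooling.
    """
    eps = 1e-12
    # per-frame binary entropy of the domain prediction, in nats
    h = -(d_probs * np.log(d_probs + eps)
          + (1.0 - d_probs) * np.log(1.0 - d_probs + eps))
    # residual attention in [1, 2]: 1 for fully ambiguous frames,
    # approaching 2 for confidently classified (misaligned) frames
    weights = 1.0 + (1.0 - h / np.log(2))
    weights = weights / weights.sum()     # normalize over time
    return weights @ features             # (D,) attentively pooled feature
```

With plain temporal pooling every frame would count equally; here a frame the discriminator separates easily contributes roughly twice as much as a fully ambiguous one.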
“…Local SSTDA is necessary to calculate the attention weights for DATP. Without this mechanism, frames will be aggregated in the same way as temporal pooling without cross-domain consideration, which has already been shown to be sub-optimal for cross-domain video tasks [4,5]. Domain Attentive Entropy (DAE): Minimum entropy regularization is a common strategy to perform more refined classifier adaptation.…”
Section: Technical Approach Details
confidence: 99%
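The Domain Attentive Entropy idea combines minimum entropy regularization on the class predictions with an attention weight derived from the domain prediction. The specific `1 + H(domain)` weighting below, attending more to domain-ambiguous frames, is a hypothetical choice for illustration rather than the paper's exact formula:

```python
import numpy as np

def domain_attentive_entropy(class_probs, d_probs):
    """Minimum entropy regularizer weighted by domain ambiguity (sketch).

    class_probs: (T, C) per-frame class probabilities.
    d_probs:     (T,) domain-discriminator outputs in (0, 1).
    Minimizing the returned value pushes class predictions toward
    confident (low-entropy) outputs, with more pressure on frames
    whose domain is hard to tell apart.
    """
    eps = 1e-12
    # per-frame entropy of the class prediction (the quantity minimized)
    h_class = -np.sum(class_probs * np.log(class_probs + eps), axis=1)
    # per-frame binary entropy of the domain prediction, used as attention
    h_dom = -(d_probs * np.log(d_probs + eps)
              + (1.0 - d_probs) * np.log(1.0 - d_probs + eps))
    weights = 1.0 + h_dom / np.log(2)   # in [1, 2]; peaks at d_probs = 0.5
    return float(np.mean(weights * h_class))
```

Confident one-hot class predictions drive the regularizer to zero regardless of the domain weights, which is the intended fixed point of entropy minimization.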