2020
DOI: 10.1016/j.neucom.2020.01.078

Zero-shot learning for action recognition using synthesized features

Cited by 38 publications (14 citation statements)
References 3 publications
“…The former assumes that only the labeled videos from the seen categories are available during training while the latter can use the unlabeled data of the unseen categories for model training. Specifically, in this work, we focus on inductive ZSAR [12], [15], [26], [42] and do not discuss the transductive approach [9], [32].…”
Section: Methods
confidence: 99%
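The inductive versus transductive distinction quoted above is purely about which data a method may touch during training. Below is a minimal sketch of the inductive protocol using toy NumPy arrays; the feature dimensions, class counts, and seen/unseen split are invented for illustration and do not come from any of the cited papers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy video features and labels; classes 0-3 play the role of "seen" actions,
# classes 4-5 the role of "unseen" actions. All values are made up.
features = rng.normal(size=(60, 128))      # 60 clips, 128-dim features
labels = rng.integers(0, 6, size=60)       # class ids 0..5
seen_classes = [0, 1, 2, 3]
unseen_classes = [4, 5]

# Inductive ZSAR: training may only use labelled clips from the seen classes.
train_mask = np.isin(labels, seen_classes)
X_train, y_train = features[train_mask], labels[train_mask]

# Clips from unseen classes are reserved strictly for evaluation. A transductive
# method could additionally consume X_test (without y_test) during training;
# the inductive protocol forbids even that.
test_mask = np.isin(labels, unseen_classes)
X_test, y_test = features[test_mask], labels[test_mask]

print(f"train clips (seen): {len(X_train)}, test clips (unseen): {len(X_test)}")
```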
“…Specifically, in the former line of research, only a few training samples are available from each action category, [82,83] proposed compound memory networks to classify videos by matching and ranking; [11] used GANs to synthesize training examples for novel categories; [6] proposed differentiable dynamic time warping to align videos of different lengths; [54] exploited CrossTransformer, to find temporally-corresponding frame tuples between the query and given few-shot videos. While in openset action recognition, it requires the model to generalise towards action categories that are unseen in the training set, one typical idea lies in learning a common representation space that is shared by seen and unseen actions, such as attributes space [19,42], semantic space [20,36], synthesizing features to unseen actions [49], using objects to create common space for unseen actions [46].…”
Section: Related Work
confidence: 99%
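The "common representation space" idea in the excerpt above can be sketched in a few lines: map visual features into a semantic space in which every action, seen or unseen, already has an embedding (for example a word vector of its name), then label a clip by its nearest class embedding. The projection matrix, class names, and dimensions below are placeholders; in a real system the projection is learned on seen-class data, e.g. by regression, rather than drawn at random.

```python
import numpy as np

rng = np.random.default_rng(1)
VIS_DIM, SEM_DIM = 128, 32

# Semantic embeddings (e.g., word vectors of class names) for seen and unseen actions.
class_names = ["run", "jump", "swim", "climb", "juggle", "fence"]  # illustrative only
class_emb = rng.normal(size=(len(class_names), SEM_DIM))
class_emb /= np.linalg.norm(class_emb, axis=1, keepdims=True)

# Visual-to-semantic projection; random here, normally learned on seen-class videos.
W = rng.normal(size=(VIS_DIM, SEM_DIM))

def classify(video_feature: np.ndarray) -> str:
    """Project a visual feature into the semantic space and return the nearest class."""
    z = video_feature @ W
    z /= np.linalg.norm(z)
    scores = class_emb @ z   # cosine similarity against every class, seen or unseen
    return class_names[int(np.argmax(scores))]

print(classify(rng.normal(size=VIS_DIM)))
```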
“…Li et al. (2016) and Tian et al. (2018) map features from videos to a semantic space shared by seen and unseen actions, while Gan et al. (2016c) train a classifier for unseen actions by performing several levels of relatedness to seen actions. Other works propose to synthesize features for unseen actions (Mishra et al. 2018, 2020), learn a universal representation of actions (Zhu et al. 2018), or differentiate seen from unseen actions through out-of-distribution detection (Mandal et al. 2019). All these works eliminate the need for attributes for unseen action classification.…”
Section: Unseen Action Classification
confidence: 99%
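Feature synthesis, the strategy in the indexed paper's title and the one the excerpt above attributes to Mishra et al., can be illustrated by drawing synthetic visual features for an unseen class from a class-conditional Gaussian whose mean depends on the class's semantic embedding, then fitting an ordinary classifier on those samples. The linear generator and the scikit-learn classifier below are stand-ins chosen for brevity, not the method of any specific cited paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
SEM_DIM, VIS_DIM = 32, 128

# Hypothetical mapping from a class's semantic embedding to the mean of its
# visual-feature distribution; in practice such a generator is learned on seen classes.
A = 0.1 * rng.normal(size=(SEM_DIM, VIS_DIM))

def synthesize_features(class_embedding: np.ndarray, n: int) -> np.ndarray:
    """Sample n synthetic visual features for one class from a conditional Gaussian."""
    mean = class_embedding @ A
    return mean + 0.05 * rng.normal(size=(n, VIS_DIM))

# Unseen actions are described only by semantic embeddings; no real videos are needed.
unseen_emb = rng.normal(size=(2, SEM_DIM))
X = np.vstack([synthesize_features(e, 200) for e in unseen_emb])
y = np.repeat([0, 1], 200)

# With synthetic features in hand, recognising unseen actions reduces to
# ordinary supervised classification.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("training accuracy on synthetic features:", clf.score(X, y))
```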