2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr52688.2022.01932
Hybrid Relation Guided Set Matching for Few-shot Action Recognition

Cited by 68 publications (74 citation statements) | References 36 publications
“…As indicated in Table 1 and Table 2, the proposed HyRSM++ surpasses other advanced approaches significantly and achieves new state-of-the-art performance. For instance, HyRSM++ improves the state-of-the-art performance from 49.2% to 55.0% under the 1-shot setting on SSv2-Full and consistently outperforms our original conference version [91]. Specifically, when extensively compared with current strict temporal alignment techniques [7,106] and complex fusion methods [48,68], HyRSM++ produces superior results under most shot settings, which implies that our approach is considerably flexible and efficient.…”
Section: Comparison With State-of-the-art
confidence: 75%
“…In this paper, we have extended our preliminary CVPR-2022 conference version [91] in the following aspects. i) We integrate the temporal coherence regularization and set matching strategy into a temporal set matching metric so that the proposed metric can explicitly leverage temporal order information in videos and match flexibly.…”
Section: Introduction
confidence: 99%
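The statement above describes folding a set matching strategy and temporal coherence regularization into a single temporal set matching metric. As a rough illustration only, and not the authors' formulation, the sketch below computes a bidirectional mean-Hausdorff-style distance between frame-feature sets and approximates temporal coherence by restricting each frame's candidate matches to a window around its own temporal position; the names `set_matching_distance`, `sample_frames`, and `window` are hypothetical.

```python
import numpy as np

def set_matching_distance(query, support, window=2):
    """Bidirectional mean-Hausdorff-style distance between two frame-feature
    sets, with a simple temporal-coherence constraint (illustrative only).

    query, support: (T, D) arrays of L2-normalised frame features; both videos
    are assumed to be sampled to the same number of frames T.
    window: a frame may only match frames whose temporal index lies within
            +/- `window` of its own position (a hypothetical regulariser).
    """
    T = query.shape[0]
    dist = 1.0 - query @ support.T                      # (T, T) cosine distances

    # Temporal coherence: forbid matches that are too far out of order.
    idx = np.arange(T)
    mask = np.abs(idx[:, None] - idx[None, :]) > window
    dist = np.where(mask, np.inf, dist)

    # Set-to-set matching: each frame is matched to its closest counterpart
    # in the other video, then both directions are averaged.
    q_to_s = dist.min(axis=1).mean()
    s_to_q = dist.min(axis=0).mean()
    return 0.5 * (q_to_s + s_to_q)

# Toy usage: assign a query video to the nearest support class.
rng = np.random.default_rng(0)

def sample_frames(t=8, d=64):
    f = rng.normal(size=(t, d))
    return f / np.linalg.norm(f, axis=1, keepdims=True)

query_feats = sample_frames()
support_sets = {"class_a": sample_frames(), "class_b": sample_frames()}
pred = min(support_sets, key=lambda c: set_matching_distance(query_feats, support_sets[c]))
print(pred)
```

Because the matching is set-to-set rather than strictly positional, misaligned but semantically similar frames can still be paired, while the window term keeps the pairing from ignoring temporal order entirely.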
“…We compare SSA²lign with state-of-the-art FSDA approaches, and prevailing UDA/VUDA and few-shot action recognition (FSAR) approaches. These methods include: FADA [26] and d-SNE [54], designed for image-based FSDA; DANN [10], MK-MMD [24], MDD [67], SAVA [8] and ACAN [56], designed for UDA/VUDA; and TRX [30], STRM [41], and HyRSM [50], proposed for FSAR. To adapt the FSAR approaches for FSVDA, the source domain is used for meta-training and the target domain for meta-testing, while, to adapt the UDA/VUDA approaches for FSVDA, the labeled target samples are used to optimize a cross-entropy loss.…”
Section: Overall Results and Comparisons
confidence: 99%
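The adaptation protocol quoted above (source episodes for meta-training, target episodes for meta-testing) can be pictured with the schematic sketch below. It assumes an episodic FSAR model that maps a support set and query clips to class logits; `fsar_model`, `source_episodes`, and `target_episodes` are placeholders, not the paper's actual interfaces.

```python
import torch
import torch.nn.functional as F

def run_fsvda_protocol(fsar_model, source_episodes, target_episodes, optimizer):
    """Schematic FSVDA evaluation of an episodic FSAR model (placeholders only).

    Meta-training uses labelled source-domain episodes; meta-testing measures
    accuracy on target-domain episodes.
    """
    # Meta-training on the source domain.
    fsar_model.train()
    for support_x, support_y, query_x, query_y in source_episodes:
        logits = fsar_model(support_x, support_y, query_x)   # (n_query, n_way)
        loss = F.cross_entropy(logits, query_y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Meta-testing on the target domain.
    fsar_model.eval()
    correct = total = 0
    with torch.no_grad():
        for support_x, support_y, query_x, query_y in target_episodes:
            logits = fsar_model(support_x, support_y, query_x)
            correct += (logits.argmax(dim=-1) == query_y).sum().item()
            total += query_y.numel()
    return correct / max(total, 1)
```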
“…Some methods [84,72,82,83] adopt the idea of global matching from few-shot image classification [50,52] to carry out few-shot matching, which results in relatively poor performance because long-term temporal alignment information is ignored in the measurement process. To exploit temporal cues, the following approaches [3,76,42,29,64,53,61,38,19,62,77] focus on local frame-level (or segment-level) alignment between query and support videos. Among them, OTAM [3] proposes a variant of the dynamic time warping technique [37] to explicitly utilize the temporal ordering information in support-query video pairs.…”
Section: Related Work
confidence: 99%
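The key idea attributed to OTAM above is an order-preserving alignment between query and support frames. The sketch below shows plain dynamic time warping over a frame-to-frame cosine-cost matrix to convey that idea; OTAM itself uses a relaxed, differentiable variant with its own boundary handling, so this is only an approximation, and the function name is hypothetical.

```python
import numpy as np

def dtw_alignment_distance(query, support):
    """Plain dynamic-time-warping cost over a frame-to-frame cost matrix.

    query, support: (Tq, D) and (Ts, D) arrays of L2-normalised frame features.
    Returns the cost of the cheapest order-preserving alignment path.
    """
    cost = 1.0 - query @ support.T                      # (Tq, Ts) cosine costs
    Tq, Ts = cost.shape
    acc = np.full((Tq + 1, Ts + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, Tq + 1):
        for j in range(1, Ts + 1):
            # Every step moves forward in the query, the support, or both,
            # so the alignment respects temporal order.
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return acc[Tq, Ts]
```

In a few-shot episode, such an alignment cost would stand in for a frame-averaged distance, so the query is assigned to the class whose support video admits the cheapest order-preserving path.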
“…Despite this, modern models require massive amounts of annotated data, which are time-consuming and laborious to collect. Few-shot action recognition is a promising direction for alleviating the data labeling problem: it aims to identify unseen classes with only a few labeled videos and has received considerable attention [82,3,61].…”
Section: Introduction
confidence: 99%