2022
DOI: 10.1007/978-3-031-19812-0_22
Meta Spatio-Temporal Debiasing for Video Scene Graph Generation

Cited by 13 publications
(6 citation statements)
References 43 publications
“…Open-vocabulary Visual Relationship Detection. The task of visual relationship detection in images (Lu et al. 2016) or videos (Shang et al. 2017), which involves classifying and localizing relationship triplets, has become a hot topic in computer vision (Tang et al. 2020; Li et al. 2022b; Cong, Yang, and Rosenhahn 2023; Zheng, Chen, and Jin 2022; Xu et al. 2022; Chen, Xiao, and Chen 2023). The field has also explored zero-shot detection (Shang et al. 2021), where all object and relationship categories are seen during training but certain triplet combinations remain unseen at test time.…”
Section: Related Work
confidence: 99%
“…Gao et al. [12] first proposed a compositional and motion-based relation prompt learning framework (RePro) in the open-vocabulary VidVRD setting. Despite these prior arts, only a few works have recognized the long-tailed predicate distribution as the bottleneck issue for the VidSGG task [20,45].…”
Section: Video-based Scene Graph Generation
confidence: 99%
“…Li et al. [20] proposed a causality-inspired interaction to weaken the false correlation between input data and predicate labels. Xu et al. [45] considered temporal, spatial, and object biases in a meta-learning paradigm. These implicit approaches mitigate the long-tail problem to some extent, but the performance on tail classes remains unsatisfactory.…”
Section: Introduction
confidence: 99%
“…MAML [9], a popular meta-learning method, was originally designed to learn a good weight initialization that can quickly adapt to new tasks at test time, and it has shown promise in few-shot learning. Subsequently, its extension [28], which requires no model updates on unseen testing scenarios, has been applied beyond few-shot learning to enhance model performance [13,2,16,46]. Differently, we propose a novel meta-learning framework to perform more reliable confidence estimation.…”
Section: Related Work
confidence: 99%
“…Meta-learning, also known as "learning to learn", allows us to train a model that generalizes well across different distributions. Specifically, in some meta-learning works [9,28,13,2,16,46], a virtual testing set is used to mimic the testing conditions during training, so that even though training is performed mainly on a virtual training set drawn from the training data, performance on the testing scenario improves. In our work, we construct the virtual testing sets so that they simulate distributions different from the virtual training set, pushing the model to learn distribution-generalizable knowledge that performs well across diverse distributions, rather than distribution-specific knowledge that performs well only on the training distribution.…”
Section: Introduction
confidence: 99%
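The virtual train/test scheme quoted above can be illustrated with a toy first-order MAML-style loop. This is a hypothetical sketch, not the cited papers' actual models: the task, the linear model, the distribution shift, and all learning rates are made-up assumptions chosen only to show the mechanism of adapting on a virtual training set and updating the initialization via the loss on a differently distributed virtual testing set.

```python
# Hypothetical illustration of meta-learning with virtual train/test splits.
# Each meta-iteration: (1) sample a virtual training set from the base
# distribution, (2) sample a virtual testing set from a shifted distribution
# to mimic unseen test conditions, (3) take an inner gradient step on the
# virtual training set, (4) update the initialization using the adapted
# weights' loss on the virtual testing set (first-order approximation).
import numpy as np

rng = np.random.default_rng(0)

def loss_grad(w, X, y):
    """Mean squared error of a linear model and its gradient w.r.t. w."""
    err = X @ w - y
    return float(np.mean(err ** 2)), 2 * X.T @ err / len(y)

def sample_task(shift):
    """Sample a toy regression task; `shift` moves the input distribution."""
    X = rng.normal(loc=shift, size=(32, 3))
    w_true = np.array([1.0, -2.0, 0.5])     # assumed ground-truth weights
    y = X @ w_true + 0.1 * rng.normal(size=32)
    return X, y

w = np.zeros(3)                 # meta-learned initialization
inner_lr, outer_lr = 0.05, 0.05

for step in range(200):
    Xtr, ytr = sample_task(shift=0.0)   # virtual training set
    Xte, yte = sample_task(shift=1.0)   # virtual testing set (shifted)

    # Inner step: adapt on the virtual training set.
    _, g_tr = loss_grad(w, Xtr, ytr)
    w_fast = w - inner_lr * g_tr

    # Outer step: evaluate the adapted weights on the virtual testing set
    # and update the initialization (first-order, no second derivatives).
    _, g_te = loss_grad(w_fast, Xte, yte)
    w = w - outer_lr * (g_tr + g_te)

final_loss, _ = loss_grad(w, *sample_task(shift=1.0))
print(final_loss)
```

Because the outer loss is computed on data whose distribution differs from the inner-loop data, the initialization is pushed toward weights that transfer across distributions rather than fitting only the training one, which is the intuition the quoted passage describes.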