2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022
DOI: 10.1109/cvpr52688.2022.01949

GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection

Cited by 53 publications (33 citation statements)
References 24 publications
“…In recent years, transformer-based HOI detection has gained significant attention and demonstrated promising results, primarily due to QPIC's innovative use of transformer-based techniques in HOI detection. Subsequently, several works have followed in the footsteps of QPIC, such as [4], [20], [34]. However, it is more comprehensive than one-stage approaches adopting transformer-based architectures, as demonstrated by using transformer-based techniques in the two-stage HOI detection method UPT [37].…”
Section: A Human-object Interaction Detection (mentioning)
confidence: 99%
“…These methods can be broadly classified into one-stage and two-stage approaches. In the case of one-stage approaches [5], [32], [41], [20], [24], the task is accomplished in a single step without the need for detecting the instance first. They can simultaneously detect all pairs of instances on an image and the interaction between these instance pairs.…”
Section: Introduction (mentioning)
confidence: 99%
“…UPT [19] applied a unary-pairwise transformer to represent each target's instance details as unary and pairwise representations. In comparison to two-stage methods, one-stage solutions [9], [20]-[26] captured context information during the early stage of feature extraction, leading to improved HOI detection performance. The success of DETR [27] has inspired many researchers in studying HOI detection. QPIC [28] applied additional detection heads and relied on a bipartite graph matching algorithm to locate HOI instances and identify interactions.…”
Section: A Human-object Interaction Detection (mentioning)
confidence: 99%
“…RLIP [23] proposed a transferable HOI detector via natural language supervision. Building upon GEN-VLKT [20], HOICLIP [29] mapped image and text encodings to a joint visual-semantic space, to capture their correlations and effectively transfer knowledge from CLIP. Our work seeks to explore a more effective framework to make full use of CLIP for improving zero-shot HOI detection performance.…”
Section: Zero-shot HOI (mentioning)
confidence: 99%