2020
DOI: 10.48550/arxiv.2007.08728
Preprint
Detecting Human-Object Interactions with Action Co-occurrence Priors

Abstract: A common problem in the human-object interaction (HOI) detection task is that numerous HOI classes have only a small number of labeled examples, resulting in training sets with a long-tailed distribution. The lack of positive labels can lead to low classification accuracy for these classes. To address this issue, we observe that there exist natural correlations and anti-correlations among human-object interactions. In this paper, we model the correlations as action co-occurrence matrices and present techni…
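The abstract's central idea, estimating a co-occurrence prior between actions from multi-label annotations, can be illustrated with a minimal sketch. The action names, toy annotations, and row-normalization below are invented for illustration and are not the paper's exact construction.

```python
import numpy as np

# Toy multi-label annotations: each human-object pair may carry several
# action labels at once (e.g. "hold" and "drink_with" for a cup).
# Vocabulary and instances are made up for illustration.
ACTIONS = ["hold", "drink_with", "ride", "straddle"]
instances = [
    {"hold", "drink_with"},
    {"hold"},
    {"ride", "straddle"},
    {"ride", "straddle"},
    {"hold", "drink_with"},
]

def cooccurrence_prior(instances, actions):
    """C[i, j] ~ P(action j | action i), estimated from label co-occurrence."""
    n = len(actions)
    idx = {a: k for k, a in enumerate(actions)}
    counts = np.zeros((n, n))
    for labels in instances:
        for a in labels:
            for b in labels:
                counts[idx[a], idx[b]] += 1
    # Row-normalize by the marginal count of action i (the diagonal).
    diag = np.diag(counts).reshape(-1, 1)
    return counts / np.maximum(diag, 1)

C = cooccurrence_prior(instances, ACTIONS)
# In this toy data "drink_with" always co-occurs with "hold",
# so P(hold | drink_with) = 1.0, while P(ride | hold) = 0.0.
```

A matrix like `C` encodes both the correlations (near-1 entries) and anti-correlations (near-0 entries) the abstract refers to.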

Cited by 8 publications (10 citation statements). References 63 publications.
“…In (Xu et al 2019; Gao et al 2020; Liu, Chen, and Zisserman 2020; Kim et al 2020; Bansal et al 2020), word embeddings or language priors are introduced; e.g., Xu et al (2019) used GloVe (Pennington, Socher, and Manning 2014) to generate word embeddings as part of the graph node representation.…”
Section: Related Work (mentioning; confidence: 99%)
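The GloVe-based node representation described in this statement can be sketched as follows. The tiny embedding table below stands in for real pretrained GloVe vectors (which would normally be loaded from a file such as glove.6B.300d.txt), and the concatenation scheme is an assumption for illustration, not the cited models' exact design.

```python
import numpy as np

# Stand-in for pretrained GloVe vectors; the 4-d values are invented.
glove = {
    "person":  np.array([0.1, 0.3, -0.2, 0.5]),
    "bicycle": np.array([0.4, -0.1, 0.2, 0.0]),
    "ride":    np.array([0.2, 0.2, 0.1, -0.3]),
}
EMB_DIM = 4

def node_representation(visual_feat, word):
    """Build a graph node feature by concatenating a visual feature with
    the word embedding of the node's category (a hypothetical scheme)."""
    emb = glove.get(word, np.zeros(EMB_DIM))  # zero vector for OOV words
    return np.concatenate([visual_feat, emb])

feat = node_representation(np.ones(8), "bicycle")  # 8-d visual + 4-d word
```

The appeal of this design is that semantically related categories get nearby embeddings, so rare HOI classes can borrow signal from frequent ones.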
“…Human-Object Interaction (HOI) detection aims to detect interactive relations between humans and their surroundings. HOI detection plays an important role in high-level human-centric scene understanding and has attracted considerable research interest recently (Ulutan and Iftekhar 2020; Kim et al 2020; Liao et al 2020; Liu, Chen, and Zisserman 2020; Bansal et al 2020; Gao, Zou, and Huang 2018), resulting in significant improvement (Gao et al 2020; Hou et al 2020; Li et al 2020; Zou et al 2021).…”
Section: Introduction (mentioning; confidence: 99%)
“…AProle comparison (Method / Backbone / AProle):

Two-stage methods:
  VSRL [11]         ResNet-50-FPN   31.8
  InteractNet [9]   ResNet-50-FPN   40.0
  GPNN [27]         ResNet-101      44.0
  RPNN [39]         ResNet-50       47.5
  VCL [14]          ResNet-101      48.3
  TIN* [18]         ResNet-50       48.7
  Zhou et al [40]   ResNet-50       48.9
  PastaNet [17]     ResNet-50       51.0
  DRG [6]           ResNet-50-FPN   51.0
  VSGNet [31]       ResNet-152      51.8
  CHG [34]          ResNet-50       52.7
  PMFNet [33]       ResNet-50-FPN   52.0
  PD-Net [38]       ResNet-152      52.6
  FCMNet [22]       ResNet-50       53.1
  ACP* [15]         ResNet-152      53.2

One-stage methods:
  UnionDet [3]      ResNet-50-FPN   47.5
  IPNet [35]        Hourglass-104   51.0
  IPNet* [35]       Hourglass-104   52.3
  Ours              ResNet-101      52.9

We conduct experiments to evaluate the relative importance of 'object' and 'interaction'. In Eq.…”
Section: Backbone (mentioning; confidence: 99%)
“…Graph Neural Networks (GNNs) further improve the feature extraction process by explicitly modeling the instance-wise interactions between objects [50,33]. Due to the nature of VRD tasks, additional information has been introduced as auxiliary training signals, such as language priors [29], prior interactiveness knowledge of objects [23], and action co-occurrence knowledge [22]. In comparison with the existing methods, SABRA is the first to identify the importance of false positives in VRD tasks and has significantly outperformed SOTA methods in our experiments.…”
Section: Visual Relationship Detection (mentioning; confidence: 99%)
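The instance-wise modeling that GNN-based VRD methods perform can be sketched as one round of message passing over an interaction graph. The mean aggregation, ReLU nonlinearity, and weight shapes below are illustrative assumptions, not any cited model's exact architecture.

```python
import numpy as np

def message_pass(node_feats, adj, W_msg, W_self):
    """One round of mean-aggregation message passing: each node combines
    its own transformed feature with the average of its neighbors'."""
    deg = np.maximum(adj.sum(axis=1, keepdims=True), 1.0)
    messages = (adj @ node_feats) / deg  # average neighbor features
    return np.maximum(node_feats @ W_self + messages @ W_msg, 0.0)  # ReLU

# Fully connected 3-node graph (e.g. one human, two objects), 4-d features;
# identity weights keep the toy example easy to follow.
adj = np.ones((3, 3)) - np.eye(3)
feats = message_pass(np.ones((3, 4)), adj, np.eye(4), np.eye(4))
```

Stacking several such rounds lets a node's feature reflect the whole human-object context rather than its own appearance alone.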
“…Standard imbalanced-learning techniques include data re-balancing [1,2,3], loss function engineering [4,16], and meta-learning [20]. VRD, as a common computer vision task, also suffers from this imbalance problem [27,22]. Specifically, [22] addresses relationship imbalance, i.e., the imbalance of positive samples, and uses action co-occurrence to provide additional labels.…”
Section: Learning Under Imbalanced Distribution (mentioning; confidence: 99%)
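The idea this statement attributes to [22], using action co-occurrence to provide additional labels, can be sketched roughly as follows. The hard-thresholding scheme and the `expand_labels` helper are guesses for illustration, not the paper's actual method.

```python
import numpy as np

def expand_labels(y, C, thresh=0.9):
    """Augment a sparse multi-hot action label vector y with positives
    strongly implied by a co-occurrence prior C, where C[i, j] ~ P(j | i).
    A sketch of the general idea only."""
    implied = np.clip(y @ C, 0.0, 1.0)       # how strongly each action is implied
    extra = implied * (implied >= thresh)    # keep only confident implications
    return np.maximum(y, extra)

# Toy prior over 2 actions: action 0 always co-occurs with action 1.
C = np.array([[1.0, 1.0],
              [0.0, 1.0]])
y = np.array([1.0, 0.0])       # only action 0 is annotated
y_aug = expand_labels(y, C)    # action 1 is added as an implied positive
```

This directly targets the scarcity of positive labels in the long tail: rare classes inherit extra supervision from the frequent classes they reliably co-occur with.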