2021
DOI: 10.48550/arxiv.2101.11223
Preprint

Multi-Instance Pose Networks: Rethinking Top-Down Pose Estimation

Abstract: A key assumption of top-down human pose estimation approaches is their expectation of having a single person present in the input bounding box. This often leads to failures in crowded scenes with occlusions. We propose a novel solution to overcome the limitations of this fundamental assumption. Our Multi-Hypothesis Pose Network (MHPNet) allows for predicting multiple 2D poses within a given bounding box. We introduce a Multi-Hypothesis Attention Block (MHAB) that can adaptively modulate channel-wise feature re…

Citations: cited by 2 publications (2 citation statements)
References: 36 publications
“…Bertasius et al (2019) extend from images to videos and propose a method for learning pose warping on sparsely labeled videos. Khirodkar et al (2021) offer a Multi-Instance Pose Network (MIPNet) that predicts multiple 2D pose instances within a bounding box. It can overcome the limitations of crowded scenes with occlusions.…”
Section: Human Pose Estimation (mentioning)
confidence: 99%
“…In contrast, datasets like MPI-INF-3DHP [46], PanopticStudio [21] and 3DPW [61] contain multi-person annotations but have limited person-person occlusion -less than 27% of all annotations have crowding (at IoU 0.5). Although previous methods [24,29,30,32,35,36,67] leverage 2D keypoint annotations from datasets like COCO [39], MPII [1], LSP-Extended [20], the 2D datasets are also known to contain similar biases [28,53,69]. These biases have affected critical design decisions in state-of-the-art methods which lead to poor generalization under heavy occlusion [19,58].…”
Section: Related Work (mentioning)
confidence: 99%