MoDeep: A Deep Learning Framework Using Motion Features for Human Pose Estimation

Jain, Arjun; Tompson, Jonathan; LeCun, Yann; Bregler, Christoph

doi:10.1007/978-3-319-16808-1_21

Cited by 114 publications (101 citation statements)

References 60 publications (84 reference statements)

“…[28][29][30] Also, there is the more challenging task of simultaneous annotation of multiple people [17,31]. In addition, there is work like that of Oliveira et al [32] that performs human part segmentation based on fully convolutional networks [23].…”

Section: Related Work (mentioning)

confidence: 99%

Stacked Hourglass Networks for Human Pose Estimation

Newell, Yang, Deng (2016). Lecture Notes in Computer Science.

4,115

4,081

Abstract. This work introduces a novel convolutional network architecture for the task of human pose estimation. Features are processed across all scales and consolidated to best capture the various spatial relationships associated with the body. We show how repeated bottom-up, top-down processing used in conjunction with intermediate supervision is critical to improving the performance of the network. We refer to the architecture as a "stacked hourglass" network based on the successive steps of pooling and upsampling that are done to produce a final set of predictions. State-of-the-art results are achieved on the FLIC and MPII benchmarks outcompeting all recent methods. Keywords: Human Pose Estimation

“…Single person pose estimation in videos has also been studied extensively in the literature [28,9,46,33,46,20,44,29,13,18]. These approaches mainly aim to improve pose estimation by utilizing temporal smoothing constraints [28,9,44,33,13] and/or optical flow information [46,20,29], but they are not directly applicable to videos with multiple potentially occluding persons.…”

Section: Related Work (mentioning)

confidence: 99%

“…Many applications, such as mentioned before, however, aim to analyze human body motion over time. While there exists a notable number of works that track the pose of a single person in a video [28,9,44,33,46,20,29,7,13,18], multi-person human pose estimation in unconstrained videos has not been addressed in the literature. In this work, we address the problem of tracking the poses of multiple persons in an unconstrained setting.…”

Section: Introduction (mentioning)

confidence: 99%

PoseTrack: Joint Multi-person Pose Estimation and Tracking

Iqbal, Milan, Gall (2017). 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

187

146

In this work, we introduce the challenging problem of joint multi-person pose estimation and tracking of an unknown number of persons in unconstrained videos. Existing methods for multi-person pose estimation in images cannot be applied directly to this problem, since it also requires to solve the problem of person association over time in addition to the pose estimation for each person. We therefore propose a novel method that jointly models multi-person pose estimation and tracking in a single formulation. To this end, we represent body joint detections in a video by a spatio-temporal graph and solve an integer linear program to partition the graph into sub-graphs that correspond to plausible body pose trajectories for each person. The proposed approach implicitly handles occlusion and truncation of persons. Since the problem has not been addressed quantitatively in the literature, we introduce a challenging "Multi-Person PoseTrack" dataset, and also propose a completely unconstrained evaluation protocol that does not make any assumptions about the scale, size, location or the number of persons. Finally, we evaluate the proposed approach and several baseline methods on our new dataset.

“…Convolutional neural networks (CNN) [1,2] have recently demonstrated superior performance on many tasks such as image classification [3,4,5], object detection [6,7,8,9,10,11], object tracking [12,13,14], text detection [15,16], text recognition [17,18,19], local feature description [20], video classification [21,22,23], human pose estimation [24,25,26], scene recognition [27,28] and scene labelling [29,30].…”

Section: Introduction (mentioning)

confidence: 99%

Learning fine-grained features via a CNN Tree for Large-scale Classification

Wang (2018). Neurocomputing.

We propose a novel approach to enhance the discriminability of Convolutional Neural Networks (CNN). The key idea is to build a tree structure that could progressively learn fine-grained features to distinguish a subset of classes, by learning features only among these classes. Such features are expected to be more discriminative, compared to features learned for all the classes. We develop a new algorithm to effectively learn the tree structure from a large number of classes. Experiments on large-scale image classification tasks demonstrate that our method could boost the performance of a given basic CNN model. Our method is quite general, hence it can potentially be used in combination with many other deep learning models.
