Skeletal parameter estimation from optical motion capture data

Kirk, Adam G.; O’Brien, James F.; Forsyth, David

doi:10.1145/1186223.1186260

Cited by 41 publications

(48 citation statements)

References 14 publications

“…Since pose estimation is much better-posed in 2D than in 3D, a popular way to infer joint positions is to use a generative model to find a 3D pose whose projection aligns with the 2D image data. In the past, this usually involved inferring a 3D human pose by optimizing an energy function derived from image information, such as silhouettes [6,14,21,22,25,31,44,49,60], trajectories [74], feature descriptors [58,62,63] and 2D joint locations [2,3,5,20,36,51,57,68,69]. Another class of approaches retrieve the pose from a dictionary of 3D poses based on similarity with the 2D image evidence [18,26,39,41,42].…”

Section: Related Work

mentioning

confidence: 99%

Learning to Fuse 2D and 3D Image Cues for Monocular Body Pose Estimation

Tekin

Márquez-Neila

Salzmann

et al. 2017

2017 IEEE International Conference on Computer Vision (ICCV)

221

125

Most recent approaches to monocular 3D human pose estimation rely on Deep Learning. They typically involve regressing from an image to either 3D joint coordinates directly or 2D joint locations from which 3D coordinates are inferred. Both approaches have their strengths and weaknesses and we therefore propose a novel architecture designed to deliver the best of both worlds by performing both simultaneously and fusing the information along the way. At the heart of our framework is a trainable fusion scheme that learns how to fuse the information optimally instead of being hand-designed. This yields significant improvements upon the state-of-the-art on standard 3D human pose estimation benchmarks.

“…The skeleton data, containing the 3-D position vectors of a set of key joints in each frame, can be extracted by some low-cost RGB-D sensors [26] (Kinect, Realsense, etc.) or motion capture system [27]. On the other hand, some works can achieve similar action recognition from depth images [19], [34] capturing the point clouds of the human body and background in 3-D space.…”

Section: Extensions On Multiple Rigid Bodies

mentioning

confidence: 99%

“…geometric features of pixels which capture the point clouds of the human body and the background in 3-D space [18]. The 3-D skeleton of a human body captured by RGB-D sensors [26] or motion capture system [27] also have been intensively studied in human action representations due to the robustness to variations of viewpoint, human body scale and motion speed as well as the real-time performance [20], [21]. In this paper, we extend the RRV descriptor to multiple rigid bodies for skeleton-based human action recognition.…”

mentioning

confidence: 99%

RRV: A Spatiotemporal Descriptor for Rigid Body Motion Recognition

Guo

Shao

2018

IEEE Trans. Cybern.

The motion behaviors of a rigid body can be characterized by a six degrees of freedom motion trajectory, which contains the 3-D position vectors of a reference point on the rigid body and 3-D rotations of this rigid body over time. This paper devises a rotation and relative velocity (RRV) descriptor by exploring the local translational and rotational invariants of rigid body motion trajectories, which is insensitive to noise, invariant to rigid transformation and scale. The RRV descriptor is then applied to characterize motions of a human body skeleton modeled as articulated interconnections of multiple rigid bodies. To show the descriptive ability of our RRV descriptor, we explore its potentials and applications in different rigid body motion recognition tasks. The experimental results on benchmark datasets demonstrate that our RRV descriptor learning discriminative motion patterns can achieve superior results for various recognition tasks.

“…The information from the markers can be useful to obtain the kinematic characteristics of human. In [4] an algorithm for automatically estimating a subject's skeletal structure from optical motion capture data is presented. This algorithm defined the cluster of markers into segment groups, determines the topological connectivity between these groups and locates the positions of their connecting joints.…”

Section: A Human To Humanoid Motion - State Of the Art

mentioning

confidence: 99%
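
The last citation statement describes grouping markers into segments before connectivity and joint positions are determined. As a rough illustration of that first step only, the sketch below assumes that two markers on the same rigid segment keep a nearly constant distance over time, and groups markers whose pairwise distance varies by less than a tolerance. This is a simplified stand-in, not the cited paper's algorithm; the names `group_markers` and `two_body_demo`, the `tol` threshold, and the synthetic data are all hypothetical.

```python
import numpy as np


def group_markers(traj, tol=1e-3):
    """Group markers whose pairwise distances stay nearly constant.

    traj: array of shape (frames, markers, 3) with marker positions.
    tol:  assumed threshold on the standard deviation of a pairwise distance.
    Returns one integer segment label per marker.
    """
    # Pairwise distances for every frame: shape (frames, markers, markers).
    diff = traj[:, :, None, :] - traj[:, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # A small spread over time suggests the two markers move rigidly together.
    rigid = dist.std(axis=0) < tol

    n = traj.shape[1]
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Flood-fill the connected component of the "rigid" graph.
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(rigid[i]):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(int(j))
        current += 1
    return labels


def two_body_demo(frames=20):
    """Synthetic data: three static markers and three markers rotating about z."""
    angles = np.linspace(0.0, np.pi / 2, frames)
    static = np.array([[5.0, 5.0, 1.0], [6.0, 5.0, 1.0], [5.0, 6.0, 1.0]])
    local = np.array([[3.0, 0.0, 0.0], [4.0, 0.0, 0.0], [3.0, 1.0, 0.0]])
    traj = np.empty((frames, 6, 3))
    for f, a in enumerate(angles):
        c, s = np.cos(a), np.sin(a)
        rot = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
        traj[f, :3] = static
        traj[f, 3:] = local @ rot.T
    return traj


if __name__ == "__main__":
    print(group_markers(two_body_demo()))
```

On real capture data the tolerance would need to account for marker noise and skin movement, and a fixed threshold with connected components is likely too brittle; it is used here only to keep the sketch short.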