Discriminative Human Full-Body Pose Estimation from Wearable Inertial Sensor Data

Schwarz, Loren; Mateus, Diana; Navab, Nassir

doi:10.1007/978-3-642-10470-1_14

Cited by 24 publications

(19 citation statements)

References 24 publications


“…Following similar ideas, in [LWC*11] they regress to full pose using online local models but using 6 IMUs to query the database. In [SMN09] they directly regress full pose using only 4 IMUs with Gaussian Process regression. Both methods report very good results when the test motions are present in the database.…”

Section: Database Retrieval and Learning Based Methods (mentioning)

confidence: 99%

Sparse Inertial Poser: Automatic 3D Human Pose Estimation from Sparse IMUs

Marcard, Rosenhahn, Black et al. 2017, Computer Graphics Forum


Figure 1: Unconstrained motion capture using our new Sparse Inertial Poser (SIP). With as few as 6 IMUs attached to the body, we recover the full pose of the subject. The key idea that makes this possible is to optimise all the poses of a statistical body model for all the frames in the sequence jointly to fit the orientation and acceleration measurements captured by the IMUs. Images are shown for reference but are not used during the optimisation.

Abstract: We address the problem of making human motion capture in the wild more practical by using a small set of inertial sensors attached to the body. Since the problem is heavily under-constrained, previous methods either use a large number of sensors, which is intrusive, or they require additional video input. We take a different approach and constrain the problem by: (i) making use of a realistic statistical body model that includes anthropometric constraints and (ii) using a joint optimization framework to fit the model to orientation and acceleration measurements over multiple frames. The resulting tracker Sparse Inertial Poser (SIP) enables 3D human pose estimation using only 6 sensors (attached to the wrists, lower legs, back and head) and works for arbitrary human motions. Experiments on the recently released TNT15 dataset show that, using the same number of sensors, SIP achieves higher accuracy than the dataset baseline without using any video data. We further demonstrate the effectiveness of SIP on newly recorded challenging motions in outdoor scenarios such as climbing or jumping over a wall.

“…Sparse IMUs. Learning methods using sparse IMUs as input have also been proposed [Schwarz et al 2009], where full pose is regressed using Gaussian Processes. The models are trained on specific movements of individual users for each activity of interest, which greatly limits its applicability.…”

Section: Learning Based Methods (mentioning)

confidence: 99%

Deep inertial poser

et al. 2018


“…Acceleration data is however very noisy and the search space of possible accelerations is under constrained making the learning a very difficult task. While (Schwarz et al 2009) directly regresses full pose using only 4 IMUs with a Gaussian Process regression, with good results when the test motions are present in the database. Similarly Pons-Moll et al (2011) uses a particle filter framework to optimise the orientation constrained by IMU samples taken from a manifold of poses, to solve for outdoor sequences.…”

Section: Related Work (mentioning)

confidence: 99%

Fusing Visual and Inertial Sensors with Semantics for 3D Human Pose Estimation

et al. 2018


We propose an approach to accurately estimate 3D human pose by fusing multi-viewpoint video (MVV) with inertial measurement unit (IMU) sensor data, without optical markers, a complex hardware setup or a full body model. Uniquely we use a multi-channel 3D convolutional neural network to learn a pose embedding from visual occupancy and semantic 2D pose estimates from the MVV in a discretised volumetric probabilistic visual hull. The learnt pose stream is concurrently processed with a forward kinematic solve of the IMU data, and a temporal model (LSTM) exploits the rich spatial and temporal long-range dependencies among the solved joints; the two streams are then fused in a final fully connected layer. The two complementary data sources allow for ambiguities to be resolved within each sensor modality, yielding improved accuracy over prior methods. Extensive evaluation is performed with state-of-the-art performance reported on the popular Human 3.6M dataset (Ionescu et al. in IEEE Trans Pattern Anal Mach Intell 36(7):1325-1339, 2014), the newly released TotalCapture dataset and a challenging set of outdoor videos, TotalCaptureOutdoor. We release the new hybrid MVV dataset (TotalCapture) comprising multi-viewpoint video, IMU and accurate 3D skeletal joint ground truth derived from a commercial motion capture system.
