Capturing accurate 3D human performances in global space from a static monocular video is an ill-posed problem: it requires resolving depth ambiguities and knowing the camera's intrinsics and extrinsics. Most methods therefore either learn with fixed, known cameras or require the camera parameters as input. We instead show that a camera's extrinsics and intrinsics can be regressed jointly with the human's global position, joint angles, and body shape from long sequences of 2D motion estimates alone. We exploit the constant parameters of a static camera by training a model that processes sequences of arbitrary length in a single forward pass while allowing full bidirectional information flow. We show that full temporal information flow is especially important when improving consistency through an adversarial network. Our training data is exclusively synthetic, and no domain adaptation is used. We achieve one of the best joint-error results on Human3.6M among models that do not use the Human3.6M training data.
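To make the core idea concrete, the sketch below shows a minimal bidirectional temporal regressor that maps a sequence of 2D joint estimates to per-frame body parameters plus a single set of static-camera parameters. The GRU backbone, module names, and output dimensions are assumptions chosen for illustration, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class GlobalPoseCameraRegressor(nn.Module):
    """Illustrative sketch (not the paper's model): a bidirectional temporal
    network that regresses per-frame body parameters and one set of camera
    parameters from a sequence of 2D joint estimates."""

    def __init__(self, num_joints=17, hidden=256,
                 body_dim=85,   # assumed: global translation + joint angles + shape
                 cam_dim=7):    # assumed: focal length + rotation + translation
        super().__init__()
        # Bidirectional encoder gives every frame access to the full sequence.
        self.encoder = nn.GRU(input_size=num_joints * 2, hidden_size=hidden,
                              num_layers=2, batch_first=True, bidirectional=True)
        self.body_head = nn.Linear(2 * hidden, body_dim)  # per-frame outputs
        self.cam_head = nn.Linear(2 * hidden, cam_dim)     # one output per video

    def forward(self, kp2d):                      # kp2d: (B, T, num_joints, 2)
        B, T = kp2d.shape[:2]
        feats, _ = self.encoder(kp2d.reshape(B, T, -1))  # full bidirectional context
        body = self.body_head(feats)                     # (B, T, body_dim)
        cam = self.cam_head(feats.mean(dim=1))           # (B, cam_dim), static camera
        return body, cam

# Single forward pass over an arbitrary-length clip of 2D keypoints.
model = GlobalPoseCameraRegressor()
body_params, camera_params = model(torch.randn(1, 300, 17, 2))
```

Predicting the camera head from a sequence-level pooled feature reflects the static-camera assumption: the parameters are constant over the clip, so one estimate per video suffices.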
Automatic gesture synthesis from speech has attracted researchers for applications in remote communication, video games, and the Metaverse. Learning the mapping between speech and 3D full-body gestures is difficult due to the stochastic nature of the problem and the lack of rich cross-modal datasets needed for training. In this paper, we propose a novel transformer-based framework for automatic 3D body gesture synthesis from speech. To capture the stochastic nature of body gestures during speech, we propose a variational transformer that models a probabilistic distribution over gestures and can produce diverse gestures at inference time. Furthermore, we introduce a mode positional embedding layer to capture the different motion speeds of different speaking modes. To cope with the scarcity of data, we design an intra-modal pre-training scheme that learns the complex mapping between speech and 3D gesture from a limited amount of data. Our system is trained on either the Trinity speech-gesture dataset or the Talking With Hands 16.2M dataset. The results show that our system produces more realistic, appropriate, and diverse body gestures than existing state-of-the-art approaches.
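The following sketch illustrates the two ingredients named above, a variational latent over gestures and a per-mode positional embedding, in a minimal form. All layer sizes, names, and the simple linear decoder are assumptions for illustration; the paper's actual architecture is not reproduced here.

```python
import torch
import torch.nn as nn

class VariationalGestureModel(nn.Module):
    """Illustrative sketch (not the paper's model): encode speech features with a
    transformer, add a learned positional embedding that depends on the speaking
    mode, sample a Gaussian latent per frame, and decode to gesture poses."""

    def __init__(self, speech_dim=80, pose_dim=165, d_model=256,
                 num_modes=3, max_len=512):
        super().__init__()
        self.speech_proj = nn.Linear(speech_dim, d_model)
        # One positional table per speaking mode, flattened into a single embedding.
        self.mode_pos_emb = nn.Embedding(num_modes * max_len, d_model)
        self.max_len = max_len
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=4)
        self.to_mu = nn.Linear(d_model, d_model)
        self.to_logvar = nn.Linear(d_model, d_model)
        self.decoder = nn.Linear(d_model, pose_dim)

    def forward(self, speech, mode_id):   # speech: (B, T, speech_dim), mode_id: (B,)
        B, T, _ = speech.shape
        # Mode-dependent positional indices select a different table per mode.
        pos = mode_id[:, None] * self.max_len + torch.arange(T, device=speech.device)
        h = self.speech_proj(speech) + self.mode_pos_emb(pos)
        h = self.encoder(h)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterisation trick
        return self.decoder(z), mu, logvar                    # gestures: (B, T, pose_dim)

# Sampling z at inference yields different plausible gestures for the same speech.
model = VariationalGestureModel()
gestures, mu, logvar = model(torch.randn(2, 120, 80), torch.tensor([0, 1]))
```

Because the latent code is sampled rather than predicted deterministically, repeated forward passes on the same speech input yield diverse gesture sequences, which is the property the variational design targets.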