We present a framework that enables the discovery of diverse and natural-looking motion strategies for athletic skills such as the high jump. The strategies are realized as control policies for physics-based characters. Given a task objective and an initial character configuration, the combination of physics simulation and deep reinforcement learning (DRL) provides a suitable starting point for automatic control policy training. To facilitate the learning of realistic human motions, we propose a Pose Variational Autoencoder (P-VAE) to constrain the actions to a subspace of natural poses. In contrast to motion imitation methods, a rich variety of novel strategies can naturally emerge by exploring initial character states through a sample-efficient Bayesian diversity search (BDS) algorithm. A second stage of optimization that encourages novel policies can further enrich the unique strategies discovered. Our method allows for the discovery of diverse and novel strategies for athletic jumping motions such as high jumps and obstacle jumps with no motion examples and less reward engineering than prior work.
In physics-based character animation, Proportional-Derivative (PD) controllers are commonly used for tracking reference motions in motor control tasks. Stable PD (SPD) controllers significantly improve the numerical stability of traditional PD controllers and support large gains and large integration time steps during simulation [TLT11]. For an articulated rigid body system with n degrees of freedom, all SPD implementations to date, however, use an O(n^3) dense matrix factorization based method. In this paper, we propose a linear time algorithm for SPD computation, which is based on Featherstone's forward dynamics formulation for articulated rigid body systems in generalized coordinates [Fea14]. We demonstrate the performance advantage of our algorithm by comparing with both the conventional dense matrix factorization based method and an alternative sparse matrix factorization based method. We show that the proposed algorithm provides superior stability when controlling complex models at large time steps. We further demonstrate that our algorithm can improve the learning speed and quality of a Deep Reinforcement Learning (DRL) system for physics-based character animation.
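To make the stability claim concrete, here is a minimal sketch that compares a conventional PD controller with the SPD formulation of [TLT11] on a single degree of freedom. It is an illustration only, not the linear-time algorithm proposed in the abstract: the unit point mass, the gains, the time step, and the function name `simulate` are all assumptions chosen for the example. In the 1-DOF case the SPD solve reduces to a scalar division by (mass + kd * dt); for n degrees of freedom that division becomes a linear solve with the mass matrix plus the damping term, which is the step the dense factorization methods handle.

```python
def simulate(controller, kp, kd, dt, steps, mass=1.0, q_target=1.0):
    """Drive a 1-DOF point mass toward q_target and return its final position.

    controller: "pd" for a conventional PD controller, "spd" for Stable PD.
    All numeric values are illustrative assumptions, not values from the paper.
    """
    q, qd = 0.0, 0.0  # position and velocity; the target velocity is zero
    for _ in range(steps):
        if controller == "pd":
            # Conventional PD: the torque uses the current state only.
            qdd = (-kp * (q - q_target) - kd * qd) / mass
        else:
            # SPD: the torque uses the predicted next-step position and velocity,
            # which moves the kd * dt term to the left-hand side of the solve.
            qdd = (-kp * (q + dt * qd - q_target) - kd * qd) / (mass + kd * dt)
        # Semi-implicit Euler integration, identical for both controllers.
        qd += dt * qdd
        q += dt * qd
    return q


# Large gains with a large time step (30 Hz), chosen so that kd >= kp * dt.
kp, kd, dt, steps = 9000.0, 600.0, 1.0 / 30.0, 100
pd_final = simulate("pd", kp, kd, dt, steps)
spd_final = simulate("spd", kp, kd, dt, steps)
print("PD final position: ", pd_final)
print("SPD final position:", spd_final)
```

With these settings the conventional controller diverges while the SPD variant settles at the target. SPD is not unconditionally stable in this sketch: the gains above keep kd at or above kp * dt, and lowering kd far below that can destabilize it as well.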
Physics-based character animation has seen significant advances in recent years with the adoption of Deep Reinforcement Learning (DRL). However, DRL-based learning methods are usually computationally expensive, and their performance depends crucially on the choice of hyperparameters. Tuning hyperparameters for these methods often requires repeated training of control policies, which is even more computationally prohibitive. In this work, we propose a novel Curriculum-based Multi-Fidelity Bayesian Optimization framework (CMFBO) for efficient hyperparameter optimization of DRL-based character control systems. Using curriculum-based task difficulty as the fidelity criterion, our method improves search efficiency by gradually pruning the search space through evaluation on easier motor skill tasks. We evaluate our method on two physics-based character control tasks: character morphology optimization and hyperparameter tuning of DeepMimic. Our algorithm significantly outperforms state-of-the-art hyperparameter optimization methods applicable to physics-based character animation. In particular, we show that hyperparameters optimized through our algorithm result in at least a 5x efficiency gain compared to the author-released settings in DeepMimic.
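The idea of using task difficulty as a fidelity level can be sketched with a simple successive-pruning loop. This is not CMFBO itself: it leaves out the Bayesian optimization entirely and only illustrates how cheap evaluations on easier tasks can shrink the candidate set before expensive ones run. The function names `evaluate` and `prune_by_curriculum`, the toy objective, the halving rule, and the cost model (cost proportional to difficulty) are all hypothetical.

```python
def evaluate(x, difficulty):
    """Hypothetical stand-in for training a policy with hyperparameter x.

    Returns a score (higher is better). The toy objective peaks at x = 0.6
    at every difficulty level; harder tasks only sharpen the differences.
    """
    return -difficulty * (x - 0.6) ** 2


def prune_by_curriculum(candidates, difficulties, keep_fraction=0.5):
    """Evaluate candidates on increasingly hard tasks, pruning after each level.

    Returns the best surviving candidate and the total evaluation cost,
    where one evaluation at a given difficulty is assumed to cost `difficulty`.
    """
    survivors = list(candidates)
    total_cost = 0
    for difficulty in difficulties:
        total_cost += difficulty * len(survivors)
        # Rank the survivors by their score at this difficulty, best first.
        ranked = sorted(survivors, key=lambda x: evaluate(x, difficulty), reverse=True)
        # Keep only the top fraction (at least one) for the next, harder level.
        survivors = ranked[:max(1, int(len(ranked) * keep_fraction))]
    return survivors[0], total_cost


candidates = [i / 15 for i in range(16)]  # 16 evenly spaced hyperparameter values
difficulties = [1, 2, 4]                  # easy to hard curriculum levels
best, cost = prune_by_curriculum(candidates, difficulties)
full_cost = len(candidates) * difficulties[-1]  # all candidates on the hardest task
print("best candidate:", best)
print("pruned cost:", cost, "vs. full-fidelity cost:", full_cost)
```

In this toy setting the ranking is the same at every difficulty level, so pruning never discards the best candidate. In practice, rankings on easy tasks only approximate rankings on hard ones, which is presumably where a model-based approach such as Bayesian optimization earns its keep.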