2022
DOI: 10.48550/arxiv.2204.07137
Preprint

Accelerated Policy Learning with Parallel Differentiable Simulation

Abstract: Deep reinforcement learning can generate complex control policies, but requires large amounts of training data to work effectively. Recent work has attempted to address this issue by leveraging differentiable simulators. However, inherent problems such as local minima and exploding/vanishing numerical gradients prevent these methods from being generally applied to control tasks with complex contact-rich dynamics, such as humanoid locomotion in classical RL benchmarks. In this work we present a high-performance…
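The approach outlined in the abstract trains a policy directly from analytic simulation gradients rather than from sampled returns alone. The snippet below is a minimal sketch of that general idea, not the paper's algorithm: a toy differentiable point-mass model stands in for the simulator, and the horizon, network sizes, and cost terms are illustrative assumptions.

```python
# Minimal sketch (not the paper's implementation): update a policy by
# backpropagating a short-horizon cost through differentiable dynamics.
# The point-mass model, horizon, and hyperparameters are assumptions.
import torch
import torch.nn as nn

policy = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

def dynamics(state, action, dt=0.05):
    # Semi-implicit Euler step for a unit-mass point driven by `action`.
    pos, vel = state[..., 0], state[..., 1]
    vel = vel + dt * action.squeeze(-1)
    pos = pos + dt * vel
    return torch.stack([pos, vel], dim=-1)

for step in range(1000):
    state = torch.randn(64, 2)              # batch of parallel rollouts
    cost = torch.zeros(())
    for _ in range(16):                     # short differentiable horizon
        action = policy(state)
        state = dynamics(state, action)
        cost = cost + (state[..., 0] ** 2).mean() + 0.01 * (action ** 2).mean()
    optimizer.zero_grad()
    cost.backward()                         # gradients flow through the dynamics
    optimizer.step()
```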

Cited by 6 publications (9 citation statements)
References 15 publications
“…The proposed tasks enabled us to study manipulation problems with fluids comprehensively, and shed light on the challenges of existing methods and the potential of utilizing differentiable physics as an effective tool for trajectory optimization in the context of complex fluid manipulation. One future direction is to extend towards more realistic problem setups using visual input, and to distill policies optimized using differentiable physics into neural-network-based policies (Lin et al., 2022a; Xu et al., 2023), or to use gradients provided by differentiable simulation to guide policy learning (Xu et al., 2022). Furthermore, since MPM was originally proposed, researchers have been using it to simulate various realistic materials with intricate properties, such as snow, sand, mud, and granular materials.…”
Section: Discussion (citation type: mentioning, confidence: 99%)
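The statement above mentions distilling behaviour optimized with differentiable physics into neural-network policies. As a hedged illustration of that general recipe, the sketch below regresses a small policy onto state-action pairs from already-optimized trajectories; the placeholder data (`opt_states`, `opt_actions`) and network sizes are assumptions, not anything from the cited papers.

```python
# Hedged sketch: distill trajectories found by differentiable-physics
# optimization into a neural-network policy via supervised regression.
# `opt_states` / `opt_actions` are placeholders for optimized trajectories.
import torch
import torch.nn as nn
import torch.nn.functional as F

policy = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 1))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

opt_states = torch.randn(4096, 2)     # states visited by optimized trajectories
opt_actions = torch.randn(4096, 1)    # corresponding optimized actions

for epoch in range(50):
    loss = F.mse_loss(policy(opt_states), opt_actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```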
“…Another approach is to implement physics simulations in a differentiable manner, usually with automatic differentiation tools (Hu et al., 2019a; de Avila Belbute-Peres et al., 2018; Xu et al., 2022; Huang et al., 2021), and the gradient information provided by these simulations has been shown to be helpful in control and manipulation tasks concerning rigid and soft bodies (Xu et al., 2021; Lin et al., 2022a; Xu et al., 2022; Li et al., 2022; Wang et al., 2023). On a related note, Taichi (Hu et al., 2019b) and Nvidia Warp (Macklin, 2022) are two recently proposed domain-specific programming languages for GPU-accelerated simulation.…”
Section: Related Work (citation type: mentioning, confidence: 99%)
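The quote above points to simulators written with automatic differentiation tools so that gradients of simulation outcomes are available to downstream optimizers. The toy snippet below shows only that bare mechanism; it is not the API of Taichi, Warp, or any cited simulator, and the one-step point-mass model is an assumption.

```python
# Hedged illustration: a toy differentiable simulation step. Automatic
# differentiation yields d(outcome)/d(action) directly; this does not
# reflect the interface of any particular differentiable simulator.
import torch

def sim_step(state, action, dt=0.05):
    # Semi-implicit Euler step for a unit-mass point driven by `action`.
    pos, vel = state
    vel = vel + dt * action
    pos = pos + dt * vel
    return torch.stack([pos, vel])

action = torch.tensor(0.3, requires_grad=True)
state = torch.tensor([0.0, 0.0])

next_state = sim_step(state, action)
outcome = (next_state[0] - 1.0) ** 2      # squared distance to a target position
outcome.backward()
print(action.grad)                        # gradient of the outcome w.r.t. the action
```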
“…This paper delves into the model-based setting (Chua et al., 2018; Janner et al., 2019; Kaiser et al., 2019; Zhang, 2022; Hafner et al., 2023), where a learned model is employed to train a control policy. Recent approaches (Mora et al., 2021; Suh et al., 2022a,b; Xu et al., 2022) based on differentiable simulators (Freeman et al., 2021; Heiden et al., 2021b) assume that gradients of simulation outcomes w.r.t. actions are explicitly given.…”
Section: Related Work (citation type: mentioning, confidence: 99%)
“…When the model is good enough, a small h may not fully leverage the accurate gradient information. As evidence, approaches (Xu et al., 2022; Mora et al., 2021) based on differentiable simulators typically adopt longer unrolls compared to model-based approaches. Therefore, with SN, more accurate multi-step predictions should enable more efficient learning without making the underlying optimization process harder.…”
Section: Benefit of Smoothness Regularization (citation type: mentioning, confidence: 99%)
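For context on the horizon h discussed above, a generic way to write the differentiable unroll (a standard formulation stated here as an assumption, not taken from the quoted paper) is:

\[
J_h(\theta) = \sum_{t=0}^{h-1} \gamma^t\, r(s_t, a_t), \qquad a_t = \pi_\theta(s_t), \qquad s_{t+1} = f(s_t, a_t),
\]
\[
\frac{d s_{t+1}}{d \theta} = \frac{\partial f}{\partial s_t}\frac{d s_t}{d \theta} + \frac{\partial f}{\partial a_t}\frac{d a_t}{d \theta}, \qquad
\frac{d a_t}{d \theta} = \frac{\partial \pi_\theta}{\partial \theta}\Big|_{s_t} + \frac{\partial \pi_\theta}{\partial s_t}\frac{d s_t}{d \theta}.
\]

Because this recursion multiplies dynamics Jacobians across the horizon, a larger h exposes more of the simulator's gradient information but also compounds those Jacobians, which is the exploding/vanishing-gradient trade-off behind the longer unrolls mentioned above.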