We explore deep reinforcement learning methods for multi-agent domains. We begin by analyzing the difficulty of traditional algorithms in the multi-agent case: Q-learning is challenged by the inherent non-stationarity of the environment, while policy gradient methods suffer from variance that grows as the number of agents increases. We then present an adaptation of actor-critic methods that considers the action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination. Additionally, we introduce a training regimen utilizing an ensemble of policies for each agent, leading to more robust multi-agent policies. We show the strength of our approach compared to existing methods in cooperative as well as competitive scenarios, where agent populations are able to discover various physical and informational coordination strategies.
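The core idea of conditioning each agent's critic on the actions of all agents can be illustrated in a deliberately tiny setting. The sketch below is not the paper's algorithm (which uses deep actor-critic networks over continuous observations); it is a stateless, tabular analogue in a hypothetical two-agent coordination game, meant only to show why a critic over the joint action space has a stationary learning target.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy cooperative matrix game: reward 1 only when both agents pick the
# same action. A per-agent critic that marginalizes out the other agent
# sees a moving target as that agent's policy changes; a critic over the
# JOINT action space (the centralized-critic idea, radically simplified)
# regresses toward a fixed reward function.
N = 2  # actions per agent

def reward(a1, a2):
    return 1.0 if a1 == a2 else 0.0

Q = np.zeros((N, N))  # centralized critic: Q[a1, a2]
alpha = 0.1
for _ in range(5000):
    a1, a2 = rng.integers(N), rng.integers(N)  # uniform exploration
    Q[a1, a2] += alpha * (reward(a1, a2) - Q[a1, a2])

# At execution time each agent acts on local information alone; here
# agent 1 best-responds under the assumption that agent 2 plays action 0.
best_for_agent1 = int(np.argmax(Q[:, 0]))
print(np.round(Q, 2))   # converges toward the reward matrix [[1, 0], [0, 1]]
print(best_for_agent1)  # 0
```

The same separation carries over to the full method: the critic is centralized (it sees joint actions) only during training, while each learned policy remains decentralized at execution time.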
Energy-based models (EBMs) are appealing due to their generality and simplicity in likelihood modeling, but they have traditionally been difficult to train. We present techniques to scale MCMC-based EBM training on continuous neural networks, and we show its success on the high-dimensional data domains of ImageNet32x32, ImageNet128x128, CIFAR-10, and robotic hand trajectories, achieving better samples than other likelihood models and nearing the performance of contemporary GAN approaches, while covering all modes of the data. We highlight some unique capabilities of implicit generation, such as compositionality and corrupt-image reconstruction and inpainting. Finally, we show that EBMs are useful models across a wide variety of tasks, achieving state-of-the-art out-of-distribution classification, adversarially robust classification, state-of-the-art continual online class learning, and coherent long-term predicted trajectory rollouts.
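The MCMC procedure underlying this kind of implicit generation is typically Langevin dynamics: samples are produced by repeatedly stepping down the energy gradient with injected Gaussian noise. The sketch below uses a hand-written quadratic energy (a 2-D Gaussian with an assumed mean `mu`) in place of a neural network energy, so the stationary distribution is known and the sampler's behavior can be checked.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy energy: E(x) = ||x - mu||^2 / 2, whose Boltzmann
# distribution exp(-E) is a standard Gaussian centered at mu.
mu = np.array([2.0, -1.0])

def energy_grad(x):
    return x - mu  # gradient of E with respect to x

def langevin_sample(x0, step=0.1, n_steps=200):
    """Langevin dynamics: x <- x - (step/2) * grad E(x) + sqrt(step) * noise."""
    x = x0.copy()
    for _ in range(n_steps):
        x = x - 0.5 * step * energy_grad(x) + np.sqrt(step) * rng.standard_normal(2)
    return x

samples = np.stack([langevin_sample(np.zeros(2)) for _ in range(500)])
print(samples.mean(axis=0))  # close to mu, up to sampling noise
```

In the actual training setup, `energy_grad` would be the autograd gradient of a neural energy network, and samples drawn this way serve as the "negative" examples in the contrastive likelihood gradient.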
Figure 1: Controllers based on high-level features can be modified to create new styles, transferred to new characters, and are robust to interactive changes in anthropometry. Left to right: normal walking, "sad" walking, asymmetric character, "baby," ostrich, dinosaur.

Abstract: This paper introduces an approach to control of physics-based characters based on high-level features of movement, such as center-of-mass, angular momentum, and end-effectors. Objective terms are used to control each feature and are combined by a prioritization algorithm. We show how locomotion can be expressed in terms of a small number of features that control balance and end-effectors. This approach is used to build controllers for human balancing, standing jump, and walking. These controllers provide numerous benefits: human-like qualities such as arm-swing, heel-off, and hip-shoulder counter-rotation emerge automatically during walking; controllers are robust to changes in body parameters; control parameters and goals may be modified at run-time; control parameters apply to intuitive properties such as center-of-mass height; and controllers may be mapped onto entirely new bipeds with different topology and mass distribution, without modifications to the controller itself. No motion capture or off-line optimization process is used.
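One standard way to combine objective terms by priority (not necessarily the exact algorithm of this paper) is nullspace projection: the highest-priority task is solved exactly, and each lower-priority task is solved only within the remaining nullspace, so it can never disturb the tasks above it. The sketch below assumes two linear tasks J_i x = b_i on a hypothetical 3-DOF control vector.

```python
import numpy as np

# Two assumed linear tasks on a 3-DOF control vector x.
J1 = np.array([[1.0, 0.0, 0.0]])  # priority 1 (e.g., a center-of-mass term)
b1 = np.array([2.0])
J2 = np.array([[0.0, 1.0, 1.0]])  # priority 2 (e.g., an end-effector term)
b2 = np.array([3.0])

# Solve task 1 exactly, then solve task 2 in the nullspace of task 1.
J1p = np.linalg.pinv(J1)
x1 = J1p @ b1                      # least-norm solution of J1 x = b1
N1 = np.eye(3) - J1p @ J1          # projector onto the nullspace of J1
x = x1 + np.linalg.pinv(J2 @ N1) @ (b2 - J2 @ x1)

print(x)  # [2.0, 1.5, 1.5]: both tasks satisfied, priority 1 untouched
```

The pseudoinverse of `J2 @ N1` maps into the nullspace of `J1`, so the correction for the second task leaves the first task's residual at zero by construction; when tasks conflict, the lower-priority one is simply satisfied as well as the remaining freedom allows.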
Figure 1: Interactive locomotion control over varied terrain. Gait, footsteps, and transitions are automatically generated based on user-specified goals, such as direction, step length, and step duration. In the above example, a user steers the biped across uneven terrain with gaps, steps, and inclines.

Abstract: This paper presents a physics-based locomotion controller based on online planning. At each time-step, a planner optimizes locomotion over multiple phases of gait. Stance dynamics are modeled using a simplified Spring-Loaded Inverted Pendulum (SLIP) model, while flight dynamics are modeled using projectile motion equations. Full-body control at each instant is optimized to match the instantaneous plan values, while also maintaining balance. Different types of gaits, including walking, running, and jumping, emerge automatically, as do transitions between different gaits. The controllers can traverse challenging terrain and withstand large external disturbances, while following high-level user commands at interactive rates.
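The flight-phase model mentioned above reduces to closed-form ballistics: between takeoff and touchdown the center of mass follows projectile motion, so a planner can predict apex height and landing position analytically. The helper below is an illustrative sketch (the function name and planar state layout are assumptions, not the paper's interface).

```python
import numpy as np

G = 9.81  # gravitational acceleration, m/s^2

def flight_apex_and_landing(p0, v0, ground_height=0.0):
    """Planar ballistic flight of the center of mass.

    p0 = (x, z) takeoff position, v0 = (vx, vz) takeoff velocity.
    Returns apex height, landing x-position, and time of flight.
    """
    x0, z0 = p0
    vx, vz = v0
    t_apex = vz / G
    z_apex = z0 + vz * t_apex - 0.5 * G * t_apex**2
    # Touchdown time: solve z0 + vz*t - G*t^2/2 = ground_height for the
    # later (positive) root of the quadratic.
    a, b, c = -0.5 * G, vz, z0 - ground_height
    t_land = (-b - np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
    x_land = x0 + vx * t_land
    return z_apex, x_land, t_land

# Example: takeoff at height 1 m with velocity (2, 3) m/s.
z_apex, x_land, t_land = flight_apex_and_landing((0.0, 1.0), (2.0, 3.0))
print(z_apex, x_land, t_land)  # ~1.46 m apex, ~1.70 m downrange, ~0.85 s
```

Because these quantities are closed-form, a footstep planner can evaluate many candidate takeoff states per time-step at interactive rates, which is what makes online multi-phase gait optimization tractable.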