2019 International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra.2019.8794206

BaRC: Backward Reachability Curriculum for Robotic Reinforcement Learning

Abstract: Model-free Reinforcement Learning (RL) offers an attractive approach to learn control policies for high-dimensional systems, but its relatively poor sample complexity often necessitates training in simulated environments. Even in simulation, goal-directed tasks whose natural reward function is sparse remain intractable for state-of-the-art model-free algorithms for continuous control. The bottleneck in these tasks is the prohibitive amount of exploration required to obtain a learning signal from the initial state…

Cited by 38 publications (32 citation statements)
References 30 publications
“…Intuitively, if the starting state s_0 for the agent is close to the end-goal state, training becomes easier, which forms a natural curriculum for training tasks whose difficulty depends on a proper distance between the initial state and the end-goal state. This method has been shown to be effective in control tasks with sparse rewards [Florensa et al., 2017; Ivanovic et al., 2019]. To simplify implementation, even though we only need a single initial state s_0 which is independent of time, we still use a Base-RNN, f_b, to output it.…”
Section: Manually Designed Curricula
mentioning, confidence: 99%
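To make the start-state curriculum idea in the statement above concrete, here is a minimal sketch in the spirit of Florensa et al. (2017). Everything named here is hypothetical, not an API from the cited papers: `env` is assumed to expose `reset_to(state)` and a Gym-style `step`, `policy` is a callable, and `train_policy` stands in for any off-the-shelf RL update.

import numpy as np

# Hypothetical reverse start-state curriculum: grow the start
# distribution outward from the goal as the policy improves.

def success_rate(env, policy, start, n_episodes=10, horizon=200):
    """Fraction of rollouts from `start` that reach the sparse goal."""
    wins = 0
    for _ in range(n_episodes):
        obs = env.reset_to(start)
        for _ in range(horizon):
            obs, reward, done, info = env.step(policy(obs))
            if done:
                wins += int(info.get("is_success", False))
                break
    return wins / n_episodes

def reverse_curriculum(env, policy, train_policy, goal_state,
                       n_iters=50, noise=0.05, n_candidates=100):
    starts = [np.asarray(goal_state, dtype=float)]
    for _ in range(n_iters):
        # Perturb randomly chosen current starts to propose
        # slightly harder candidates.
        candidates = [starts[np.random.randint(len(starts))]
                      + noise * np.random.randn(starts[0].size)
                      for _ in range(n_candidates)]
        train_policy(env, policy, start_states=candidates)
        # Keep starts of intermediate difficulty; the frontier
        # drifts away from the goal as the policy improves.
        kept = [s for s in candidates
                if 0.1 < success_rate(env, policy, s) < 0.9]
        starts = kept or starts
    return policy

The 0.1 to 0.9 success-rate band is the "good starts" heuristic from the reverse-curriculum literature: states the policy sometimes, but not always, solves carry the strongest learning signal.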
“…Automatic curriculum learning (ACL) for deep reinforcement learning (DRL) [Portelas et al., 2020a] has recently emerged as a promising tool to learn how to adapt an agent's learning tasks to its capabilities during training. ACL can be applied to DRL in various ways, including adapting initial states [Florensa et al., 2017; Ivanovic et al., 2019], shaping reward functions [Bellemare et al., 2016; Shyam et al., 2019], and generating goals [Lair et al., 2019; Sukhbaatar et al., 2017].…”
Section: Introduction
mentioning, confidence: 99%
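As a rough illustration of the loop these ACL variants share, the sketch below selects among parameterized task variants by absolute learning progress, a much-simplified cousin of the criterion used by methods like ALP-GMM. All names here are hypothetical and not drawn from the cited papers.

import random

# Hypothetical bandit-style ACL teacher: propose the task variant
# whose success rate changed most recently (absolute learning
# progress), with epsilon-greedy exploration over the task space.

class ProgressTeacher:
    def __init__(self, task_params, eps=0.2):
        # For each task parameterization: [previous, latest] success rate.
        self.history = {p: [0.0, 0.0] for p in task_params}
        self.eps = eps

    def propose(self):
        if random.random() < self.eps:
            return random.choice(list(self.history))
        return max(self.history,
                   key=lambda p: abs(self.history[p][1] - self.history[p][0]))

    def update(self, param, success_rate):
        self.history[param] = [self.history[param][1], success_rate]

A training loop would alternate teacher.propose(), a batch of RL updates on the proposed task, and teacher.update() with the measured success rate.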
“…One line of work proposes a curriculum of start states for a constant goal state, while the work in [16] estimates the next or intermediary goal of appropriate complexity using a Generative Adversarial Network (GAN). [17] uses the backward reachability decomposition between different goal and start states to estimate reachability and obtain a measure of task complexity. [18] quantifies the complexity of different training environments (teacher network) and generates a curriculum for the student networks to learn. [19] masks certain features of the goal vector to reduce its complexity and trains the agent on the reduced-complexity goal states.…”
Section: Curriculum Learning
mentioning, confidence: 99%
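The goal-masking idea mentioned last in the statement above can be sketched in a few lines. This is one plausible mechanism (masked goal dimensions are replaced by already-achieved values, so fewer dimensions are binding), not the cited paper's exact scheme; `mask_goal` and its parameters are illustrative.

import numpy as np

def mask_goal(goal, achieved, difficulty, rng=np.random):
    """Keep only a `difficulty`-fraction of goal dimensions binding.

    Masked dimensions are copied from the currently achieved state,
    so they are trivially satisfied; raising `difficulty` toward 1.0
    restores the full goal.
    """
    goal = np.asarray(goal, dtype=float)
    n_active = max(1, int(round(difficulty * goal.size)))
    active = rng.choice(goal.size, size=n_active, replace=False)
    masked = np.asarray(achieved, dtype=float).copy()
    masked[active] = goal[active]
    return masked

# Example: early in training, only ~1 of 3 reach-target coordinates binds.
easy_goal = mask_goal(goal=[0.4, -0.2, 0.7], achieved=[0.0, 0.0, 0.5],
                      difficulty=0.34)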
“…Similar to reverse curriculum generation, [49] introduces the BaRC algorithm, which exploits the agent's ability to calculate the backward reachable set of states based on an approximate physical model of the system. The algorithm requires decomposition methods and software tools to efficiently calculate backward reachability from intermediate states and modify the start state distribution.…”
Section: Curriculum Through Intermediate Goals
mentioning, confidence: 99%
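For intuition, here is a minimal sketch of the BaRC-style expansion loop. An important hedge: BaRC itself computes backward reachable sets with Hamilton-Jacobi reachability plus system decomposition; this sketch substitutes a crude sampling approximation, and `approx_model.step_backward`, `train_policy`, and the `env` interface are all hypothetical stand-ins.

import numpy as np

def sample_backward_reachable(approx_model, frontier, action_space,
                              horizon=0.5, dt=0.05, n_samples=200,
                              rng=np.random):
    """Crude stand-in for a backward reachable set: integrate the
    approximate dynamics backward under random controls."""
    states = []
    for _ in range(n_samples):
        s = frontier[rng.randint(len(frontier))]
        for _ in range(int(horizon / dt)):
            a = action_space.sample()
            s = approx_model.step_backward(s, a, dt)  # hypothetical inverse step
        states.append(s)
    return states

def barc_style_curriculum(env, policy, train_policy, approx_model,
                          goal_state, n_expansions=20):
    frontier = [goal_state]
    for _ in range(n_expansions):
        # States that can reach the current frontier under the
        # approximate model become the new, harder start distribution.
        starts = sample_backward_reachable(approx_model, frontier,
                                           env.action_space)
        train_policy(env, policy, start_states=starts)
        frontier = starts
    return policy

The property BaRC relies on is that even a coarse physical model identifies which states can dynamically reach the goal region, so exploration effort is never wasted on starts from which the goal is unreachable.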
“…Teacher-student CL [78]: Adding Decimal Numbers; Solving Minecraft Mazes
ALP-GMM [92]: BipedalWalker
GoalGAN [29]: Robotic Ant; Multi-Path Point-mass Maze
Reverse Curriculum [30]: N-dimensional Point Mass (Locomotion and Navigation); Inserting Ring on a Peg; Key into a Keyhole; Point-mass Navigation
Asymmetric self-play with intrinsic motivation [106]: Mazebase LightKey; RLLab Swimmer Gather; Mountain Car; StarCraft
BaRC [49]: Car Motion Planning; Planar Quadrotor
Distributed PPO [45]: Planar Walker; Quadruped; Humanoid
Value function curriculum [94]: Traffic Intersection Traverse; Traffic Intersection Approaching
Mix & Match [21]: DeepMind 3D Environment Suite
Rarity of Events [53]: Doom
Curriculum-HER [27]: FetchReach; HandManipulate with Block, Egg and Pen
Curriculum Graph [110]: Gridworld; Block Dude; Ms Pac Man
Curriculum goal masking [26]: FetchPush; FetchPickAndPlace
Self-paced prioritized curriculum [96]: Space…”
Section: Summary, Conclusion and Future Work
mentioning, confidence: 99%