2019 International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra.2019.8794206

BaRC: Backward Reachability Curriculum for Robotic Reinforcement Learning

Abstract: Model-free Reinforcement Learning (RL) offers an attractive approach to learn control policies for high-dimensional systems, but its relatively poor sample complexity often necessitates training in simulated environments. Even in simulation, goal-directed tasks whose natural reward function is sparse remain intractable for state-of-the-art model-free algorithms for continuous control. The bottleneck in these tasks is the prohibitive amount of exploration required to obtain a learning signal from the initial state…

Cited by 38 publications (32 citation statements)
References 30 publications
“…Intuitively, if the starting state s_0 for the agent is close to the end-goal state, training becomes easier, which forms a natural curriculum for training tasks whose difficulty depends on a proper distance between the initial state and the end-goal state. This method has been shown to be effective in control tasks with sparse rewards [Florensa et al., 2017; Ivanovic et al., 2019]. To simplify implementation, even though we only need a single initial state s_0 which is independent of time, we still use a Base-RNN, f_b, to output it.…”
Section: Manually Designed Curricula
mentioning, confidence: 99%
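To make the start-state curriculum idea in the statement above concrete, here is a minimal sketch in the spirit of Florensa et al. (2017). Everything named here is hypothetical, not an API from the cited papers: `env` is assumed to expose `reset_to(state)` and a Gym-style `step`, `policy` is a callable, and `train_policy` stands in for any off-the-shelf RL update.

import numpy as np

# Hypothetical reverse start-state curriculum: grow the start
# distribution outward from the goal as the policy improves.

def success_rate(env, policy, start, n_episodes=10, horizon=200):
    """Fraction of rollouts from `start` that reach the sparse goal."""
    wins = 0
    for _ in range(n_episodes):
        obs = env.reset_to(start)
        for _ in range(horizon):
            obs, reward, done, info = env.step(policy(obs))
            if done:
                wins += int(info.get("is_success", False))
                break
    return wins / n_episodes

def reverse_curriculum(env, policy, train_policy, goal_state,
                       n_iters=50, noise=0.05, n_candidates=100):
    starts = [np.asarray(goal_state, dtype=float)]
    for _ in range(n_iters):
        # Perturb randomly chosen current starts to propose
        # slightly harder candidates.
        candidates = [starts[np.random.randint(len(starts))]
                      + noise * np.random.randn(starts[0].size)
                      for _ in range(n_candidates)]
        train_policy(env, policy, start_states=candidates)
        # Keep starts of intermediate difficulty; the frontier
        # drifts away from the goal as the policy improves.
        kept = [s for s in candidates
                if 0.1 < success_rate(env, policy, s) < 0.9]
        starts = kept or starts
    return policy

The 0.1 to 0.9 success-rate band is the "good starts" heuristic from the reverse-curriculum literature: states the policy sometimes, but not always, solves carry the strongest learning signal.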
“…Automatic curriculum learning (ACL) for deep reinforcement learning (DRL) [Portelas et al., 2020a] has recently emerged as a promising tool to learn how to adapt an agent's learning tasks to its capabilities during training. ACL can be applied to DRL in various ways, including adapting initial states [Florensa et al., 2017; Ivanovic et al., 2019], shaping reward functions [Bellemare et al., 2016; Shyam et al., 2019], and generating goals [Lair et al., 2019; Sukhbaatar et al., 2017].…”
Section: Introduction
mentioning, confidence: 99%
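As a rough illustration of the loop these ACL variants share, the sketch below selects among parameterized task variants by absolute learning progress, a much-simplified cousin of the criterion used by methods like ALP-GMM. All names here are hypothetical and not drawn from the cited papers.

import random

# Hypothetical bandit-style ACL teacher: propose the task variant
# whose success rate changed most recently (absolute learning
# progress), with epsilon-greedy exploration over the task space.

class ProgressTeacher:
    def __init__(self, task_params, eps=0.2):
        # For each task parameterization: [previous, latest] success rate.
        self.history = {p: [0.0, 0.0] for p in task_params}
        self.eps = eps

    def propose(self):
        if random.random() < self.eps:
            return random.choice(list(self.history))
        return max(self.history,
                   key=lambda p: abs(self.history[p][1] - self.history[p][0]))

    def update(self, param, success_rate):
        self.history[param] = [self.history[param][1], success_rate]

A training loop would alternate teacher.propose(), a batch of RL updates on the proposed task, and teacher.update() with the measured success rate.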
“…One line of work proposes a curriculum of start states for a constant goal state, while the work in [16] estimates the next or intermediary goal of appropriate complexity using a Generative Adversarial Network (GAN). [17] uses the backward reachability decomposition between different goal and start states to estimate reachability and obtain a measure of task complexity. [18] quantifies the complexity of different training environments (teacher network) and generates a curriculum for the student networks to learn. [19] masks certain features of the goal vector to reduce its complexity and trains the agent on the reduced-complexity goal states.…”
Section: Curriculum Learning
mentioning, confidence: 99%
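The goal-masking idea mentioned last in the statement above can be sketched in a few lines. This is one plausible mechanism (masked goal dimensions are replaced by already-achieved values, so fewer dimensions are binding), not the cited paper's exact scheme; `mask_goal` and its parameters are illustrative.

import numpy as np

def mask_goal(goal, achieved, difficulty, rng=np.random):
    """Keep only a `difficulty`-fraction of goal dimensions binding.

    Masked dimensions are copied from the currently achieved state,
    so they are trivially satisfied; raising `difficulty` toward 1.0
    restores the full goal.
    """
    goal = np.asarray(goal, dtype=float)
    n_active = max(1, int(round(difficulty * goal.size)))
    active = rng.choice(goal.size, size=n_active, replace=False)
    masked = np.asarray(achieved, dtype=float).copy()
    masked[active] = goal[active]
    return masked

# Example: early in training, only ~1 of 3 reach-target coordinates binds.
easy_goal = mask_goal(goal=[0.4, -0.2, 0.7], achieved=[0.0, 0.0, 0.5],
                      difficulty=0.34)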
“…Similar to reverse curriculum generation, [49] introduces the BaRC algorithm, which exploits the agent's ability to calculate the backward reachable set of states based on an approximate physical model of the system. The algorithm requires decomposition methods and software tools to efficiently calculate backward reachability from intermediate states and modify the start state distribution.…”
Section: Curriculum Through Intermediate Goals
mentioning, confidence: 99%
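For intuition, here is a minimal sketch of the BaRC-style expansion loop. An important hedge: BaRC itself computes backward reachable sets with Hamilton-Jacobi reachability plus system decomposition; this sketch substitutes a crude sampling approximation, and `approx_model.step_backward`, `train_policy`, and the `env` interface are all hypothetical stand-ins.

import numpy as np

def sample_backward_reachable(approx_model, frontier, action_space,
                              horizon=0.5, dt=0.05, n_samples=200,
                              rng=np.random):
    """Crude stand-in for a backward reachable set: integrate the
    approximate dynamics backward under random controls."""
    states = []
    for _ in range(n_samples):
        s = frontier[rng.randint(len(frontier))]
        for _ in range(int(horizon / dt)):
            a = action_space.sample()
            s = approx_model.step_backward(s, a, dt)  # hypothetical inverse step
        states.append(s)
    return states

def barc_style_curriculum(env, policy, train_policy, approx_model,
                          goal_state, n_expansions=20):
    frontier = [goal_state]
    for _ in range(n_expansions):
        # States that can reach the current frontier under the
        # approximate model become the new, harder start distribution.
        starts = sample_backward_reachable(approx_model, frontier,
                                           env.action_space)
        train_policy(env, policy, start_states=starts)
        frontier = starts
    return policy

The property BaRC relies on is that even a coarse physical model identifies which states can dynamically reach the goal region, so exploration effort is never wasted on starts from which the goal is unreachable.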
“…Teacher-student CL [78]: Adding Decimal Numbers; Solving Minecraft Mazes
ALP-GMM [92]: BipedalWalker
GoalGAN [29]: Robotic Ant; Multi-Path Point-mass Maze
Reverse Curriculum [30]: N-dimensional Point Mass (Locomotion and Navigation); Inserting Ring on a Peg; Key into a Keyhole; Point-mass Navigation
Asymmetric self-play with intrinsic motivation [106]: Mazebase LightKey; RLLab Swimmer Gather; Mountain Car; StarCraft
BaRC [49]: Car Motion Planning; Planar Quadrotor
Distributed PPO [45]: Planar Walker; Quadruped; Humanoid
Value function curriculum [94]: Traffic Intersection Traverse; Traffic Intersection Approaching
Mix & Match [21]: DeepMind 3D Environment Suite
Rarity of Events [53]: Doom
Curriculum-HER [27]: FetchReach; HandManipulate with Block, Egg and Pen
Curriculum Graph [110]: Gridworld; Block Dude; Ms Pac Man
Curriculum goal masking [26]: FetchPush; FetchPickAndPlace
Self-paced prioritized curriculum [96]: Space…”
Section: Summary, Conclusion and Future Work
mentioning, confidence: 99%