2018 IEEE-RAS 18th International Conference on Humanoid Robots (Humanoids) 2018
DOI: 10.1109/humanoids.2018.8624948
Safe-to-Explore State Spaces: Ensuring Safe Exploration in Policy Search with Hierarchical Task Optimization

Abstract: Policy search reinforcement learning allows robots to acquire skills by themselves. However, the learning procedure is inherently unsafe, as the robot has no a priori way to predict the consequences of the exploratory actions it takes. Exploration can therefore lead to collisions with the potential to harm the robot and/or the environment. In this work we address the safety aspect by constraining the exploration to happen in safe-to-explore state spaces. These are formed by decomposing target skills (e.g., gras…

Cited by 3 publications (5 citation statements)
References 27 publications
“…The prior work constrains the agent to explore in Safe-To-Explore State Spaces (STESS) [4], which decompose a robotic skill into prioritized elemental tasks; a normalized Radial Basis Function (RBF) network [4] is used to represent the learned policy. We continue with the STESS framework in this paper and further construct several phases for different period constraints.…”
Section: Related Work
confidence: 99%
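The normalized RBF policy representation mentioned in the statement above can be sketched in a few lines. This is a minimal one-dimensional illustration, not the paper's implementation: the centers, weights, and bandwidth below are assumed placeholder values.

```python
import math

def normalized_rbf_policy(x, centers, weights, sigma=0.1):
    """Evaluate a 1-D normalized RBF network at input x.

    Each Gaussian basis is centered at one of `centers`; because the
    activations are normalized to sum to 1, the output is a convex
    combination of the learned `weights`.
    """
    # Gaussian basis activations
    acts = [math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2)) for c in centers]
    total = sum(acts)
    # Normalized blend of the per-basis weights
    return sum(w * a / total for w, a in zip(weights, acts))
```

Because the normalized activations form a convex combination, the policy output is always bounded by the minimum and maximum weights — one reason this representation pairs well with constrained exploration.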
“…We take advantage of STESS [4] to enable safe exploration of the lower-ranked RL task in the null space of higher-ranked tasks. All tasks are solved in the acceleration space, and the objective function can be formulated as…”
Section: B. Reinforcement Learning in Null Space
confidence: 99%
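The null-space idea in the statement above can be sketched with a standard pseudoinverse-based projection: the secondary (RL-explored) acceleration is filtered through the null-space projector of the higher-ranked task's Jacobian, so it cannot disturb that task. This is a generic joint-space sketch under assumed shapes, not the paper's actual formulation.

```python
import numpy as np

def prioritized_accelerations(J1, xdd1_des, qdd2_des):
    """Combine a primary task acceleration with a secondary one.

    J1:        (m, n) Jacobian of the higher-ranked task
    xdd1_des:  (m,)   desired task-space acceleration of that task
    qdd2_des:  (n,)   joint-space acceleration proposed by the
                      lower-ranked (exploratory) task
    """
    J1_pinv = np.linalg.pinv(J1)
    # Joint accelerations realizing the primary task
    qdd1 = J1_pinv @ xdd1_des
    # Null-space projector of the higher-ranked task
    N1 = np.eye(J1.shape[1]) - J1_pinv @ J1
    # Secondary motion projected so it leaves the primary task untouched
    return qdd1 + N1 @ qdd2_des
```

By construction, J1 applied to the combined acceleration still yields the primary task's desired acceleration, regardless of what the exploratory component proposes.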
“…The motion planning and control of the robot are completed in MoveIt, an open-source project in ROS. In this case, we modified an open-source unified robot description format (URDF) model of YuMi provided by Lundell et al. [35]. Because YuMi's manipulators and grippers are controlled independently via different IP addresses, we divided the whole robot into four motion planning groups after URDF remodeling: left arm, right arm, left hand, and right hand.…”
Section: Robot-Control Subsystem, 1) Basic Configuration
confidence: 99%
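In MoveIt, planning groups like the four described above are declared in the robot's SRDF file, which accompanies the URDF. The excerpt below is a hypothetical sketch of that structure; the link names are placeholders, not YuMi's actual identifiers.

```xml
<!-- Hypothetical SRDF excerpt: group names follow the cited text;
     base/tip link names are placeholders, not YuMi's real links. -->
<robot name="yumi">
  <group name="left_arm">
    <chain base_link="yumi_base_link" tip_link="left_arm_tip_link"/>
  </group>
  <group name="right_arm">
    <chain base_link="yumi_base_link" tip_link="right_arm_tip_link"/>
  </group>
  <group name="left_hand">
    <link name="left_gripper_link"/>
  </group>
  <group name="right_hand">
    <link name="right_gripper_link"/>
  </group>
</robot>
```

Splitting arms and grippers into separate groups lets MoveIt plan and execute for each controller endpoint independently, which matches the independent IP-based control described in the statement.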