2013 IEEE/RSJ International Conference on Intelligent Robots and Systems
DOI: 10.1109/iros.2013.6696349
Ensuring safety of policies learned by reinforcement: Reaching objects in the presence of obstacles with the iCub

Abstract: Given a stochastic policy learned by reinforcement, we wish to ensure that it can be deployed on a robot with demonstrably low probability of unsafe behavior. Our case study is about learning to reach target objects positioned close to obstacles, and ensuring a reasonably low collision probability. Learning is carried out in a simulator to avoid physical damage in the trial-and-error phase. Once a policy is learned, we analyze it with probabilistic model checking tools to identify and correct potential unsafe …
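The kind of analysis described in the abstract can be pictured with a small sketch. The following Python snippet is an illustrative assumption, not the paper's actual model: it abstracts a learned stochastic policy as a Markov chain over a discretised 1-D workspace and computes the probability of ever entering a collision state, which is the reachability query a probabilistic model checker such as PRISM answers for a property like P=? [ F "collision" ]. The state space, policy probabilities, and obstacle placement are hypothetical.

```python
import numpy as np

# Sketch only (assumed toy model): states 0..9 along a reach trajectory, with
# the obstacle at one end and the target at the other, both absorbing.
N = 10
OBSTACLE, GOAL, START = 0, 9, 5

# Hypothetical stochastic policy: from each non-absorbing state, move toward
# the target with probability 0.85 and drift back toward the obstacle otherwise.
p_toward_goal = 0.85

# Transition matrix of the Markov chain induced by the policy.
P = np.zeros((N, N))
for s in range(N):
    if s in (OBSTACLE, GOAL):
        P[s, s] = 1.0                      # absorbing states
    else:
        P[s, s + 1] += p_toward_goal
        P[s, s - 1] += 1.0 - p_toward_goal

# Fixed-point iteration for the reachability probabilities
#   x[s] = sum_s' P[s, s'] * x[s'],  with x[OBSTACLE] = 1 and x[GOAL] = 0.
x = np.zeros(N)
x[OBSTACLE] = 1.0                          # preserved by the absorbing rows
for _ in range(100_000):
    x_new = P @ x
    converged = np.max(np.abs(x_new - x)) < 1e-12
    x = x_new
    if converged:
        break

print(f"collision probability from the start state: {x[START]:.6f}")
```

A probabilistic model checker solves the same equations exactly (by linear algebra or value iteration) over the full state space; if the computed collision probability exceeds a chosen bound, the policy is flagged for correction before deployment.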

Cited by 14 publications (7 citation statements)
References 14 publications
“…This way we are able to send arbitrary motions to our system, while ensuring the safety of our robot. Even with just these static objects, this has been shown to provide an interesting way to learn robot reaching behaviors through reinforcement (Pathak et al, 2013;Frank et al, 2014). The presented system has the same functionality also for arbitrary, non-static objects.…”
Section: Example: Reaching While Avoiding a Moving Obstacle
confidence: 99%
“…To avoid the inherent risks of contact-rich manipulation for the robot and its environment, various approaches have been proposed that can mainly be classified in three categories: designing fail-safe mechanisms [7], [8], incorporating safety criteria in the reward/cost function [9], [10] and lastly, limiting the permitted actions [11], [12]. Restricting the actions has the advantage of enabling an explicit expression of unsafe actions, while minimally intervening with the nominal function of the controller.…”
Section: A. Related Work
confidence: 99%
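The "limiting the permitted actions" idea quoted above can be sketched as a simple safety shield that filters the actions proposed by a stochastic policy. The geometry, policy, and obstacle test below are hypothetical stand-ins, not taken from the cited works [11], [12] or from the iCub setup.

```python
import numpy as np

rng = np.random.default_rng(0)

OBSTACLE_CENTER = np.array([0.4, 0.0])   # assumed obstacle position
SAFE_RADIUS = 0.15                       # assumed safety margin

def is_safe(next_pos: np.ndarray) -> bool:
    """Reject any motion whose predicted end point enters the safety margin."""
    return np.linalg.norm(next_pos - OBSTACLE_CENTER) > SAFE_RADIUS

def sample_policy(pos: np.ndarray) -> np.ndarray:
    """Stand-in for a learned stochastic policy: noisy step toward the target."""
    target = np.array([0.8, 0.2])
    step = 0.05 * (target - pos) / np.linalg.norm(target - pos)
    return step + rng.normal(scale=0.02, size=2)

def shielded_step(pos: np.ndarray, max_tries: int = 20) -> np.ndarray:
    """Resample the policy until a safe action is found; otherwise do nothing."""
    for _ in range(max_tries):
        action = sample_policy(pos)
        if is_safe(pos + action):
            return pos + action
    return pos  # conservative fallback: keep the current configuration

pos = np.array([0.0, 0.0])
for _ in range(40):
    pos = shielded_step(pos)
print("final end-effector position:", pos)
```

The shield only vetoes individual actions, so the nominal controller keeps running unmodified whenever its proposals are safe, which is the "minimal intervention" property the quoted passage highlights.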
“…In robotics, [83] and [84] propose to use model checking with an extension to estimate the probability that the properties are satisfied. To avoid modeling the software, the functional layer is checked directly as Java code in [85].…”
Section: Static Verification
confidence: 99%