Reinforcement Learning (RL) methods are widely used for dynamic control tasks. Many of these are high-risk tasks in which the trial-and-error process may select actions whose execution from unsafe states can be catastrophic. In addition, many of these tasks have continuous state and action spaces, which makes the learning problem harder and intractable for conventional RL algorithms. So, when the agent begins to interact with a risky environment that has a large state-action space, an important question arises: how can we prevent exploration of the state-action space from damaging the learning system (or other systems)? In this paper, we define the concept of risk and address the problem of safe exploration in the context of RL. Our notion of safety is concerned with states that can lead to damage. Moreover, we introduce an algorithm that safely improves suboptimal but robust behaviors for continuous state and action control tasks, and that learns efficiently from the experience gathered from the environment. We report experimental results using the helicopter hovering task from the RL Competition.
In this paper we present an approach to the control of autonomous robots based on Automated Planning (AP) techniques, for which we developed a control architecture, ROPEM (RObot Plan Execution with Monitoring). The proposed architecture is composed of a set of modules that integrate deliberation with a standard planner, execution, monitoring and replanning. We avoid dependency on a particular robotic device or platform by using a low-level control layer, implemented in the Player framework, that is separated from the high-level task execution, which depends on the domain at hand; this separation also makes both layers reusable. Because robot task execution is non-deterministic, we cannot predict the result of performing a given action; for that reason we also use a module that supervises the execution and detects when the robot has reached its goals or an unexpected state. Separately from execution, a planning module is in charge of determining the actions that will let the robot achieve its high-level goals. To test the performance of our contribution, we conducted a set of experiments on the Rovers domain from the International Planning Competition (IPC) with a real robot (Pioneer P3DX). We tested the planning/replanning capabilities of the ROPEM architecture with different controlled sources of uncertainty.
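The interplay of planning, execution, monitoring and replanning described in this abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the one-dimensional corridor domain and the names `plan`, `predict`, `make_faulty_executor` and `run` are hypothetical, and none of them is taken from ROPEM, the Player framework or the Rovers domain.

```python
from typing import Callable, List, Set, Tuple


def plan(position: int, goal: int) -> List[str]:
    """Toy planner (hypothetical): unit moves in a straight line to the goal."""
    step = "forward" if goal >= position else "backward"
    return [step] * abs(goal - position)


def predict(position: int, action: str) -> int:
    """Expected outcome of an action if execution were deterministic."""
    return position + (1 if action == "forward" else -1)


def make_faulty_executor(failing_steps: Set[int]) -> Callable[[int, str], int]:
    """Build an executor whose n-th call has no effect when n is in failing_steps.

    This stands in for a controlled source of uncertainty.
    """
    calls = {"n": 0}

    def execute(position: int, action: str) -> int:
        calls["n"] += 1
        if calls["n"] in failing_steps:
            return position  # the action silently failed
        return predict(position, action)

    return execute


def run(start: int, goal: int, execute: Callable[[int, str], int],
        max_replans: int = 5) -> Tuple[int, int]:
    """Execute a plan, monitor each step, and replan on unexpected states.

    Returns the final position and the number of replanning episodes.
    """
    position, replans = start, 0
    actions = plan(position, goal)
    while position != goal:
        action = actions.pop(0)
        expected = predict(position, action)
        position = execute(position, action)
        if position != expected:  # monitoring detected an unexpected state
            if replans == max_replans:
                raise RuntimeError("replanning budget exhausted")
            replans += 1
            actions = plan(position, goal)  # replan from the observed state
    return position, replans
```

For example, `run(0, 4, make_faulty_executor({2, 3}))` reaches position 4 after two replanning episodes, one for each failed action. Keeping `plan` and `execute` as separate callables loosely mirrors the separation between high-level deliberation and low-level control that the abstract describes.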
Neuro-rehabilitation therapies exploit the use-dependent plasticity of our neuromuscular system to help patients who suffer from injuries or diseases of this system, such as those caused by brain damage before or during birth or in the first years of life (e.g. due to cerebral palsy or obstetric brachial plexus palsy). These therapies take advantage of the fact that motor activity alters the properties of our neurons and muscles, including the pattern of their connectivity, and thus their functionality. Hence, a sensorimotor treatment in which the patient makes certain movements will help the patient to (re)learn how to move the affected body parts. But these traditional rehabilitation processes come at a cost: therapies are usually repetitive and lengthy, which reduces motivation and adherence to the treatment and thus limits the benefits for the patients. This paper describes the motivation, experiences and current efforts towards the final development of THERAPIST, a socially interactive robot for neuro-rehabilitation assistance. Our starting hypothesis was that patients could become consistently engaged in a therapeutic non-physical interaction with a robot, facilitating the design of new therapies that should improve patient recovery time and reduce the overall socio-economic costs. This hypothesis was validated by our initial experimental studies, which showed that pediatric patients can easily be drawn into highly attentive and collaborative attitudes by letting them interact with a robot. However, in order to be safe and robust, this robot was teleoperated, which required considerable supervision effort from clinical professionals. The development of a real socially interactive robot will require the intersection of multiple challenging directions of research that we are currently exploring.