Recovery RL: Safe Reinforcement Learning With Learned Recovery Zones

Thananjeyan, Brijen; Balakrishna, Ashwin; Nair, Suraj; Luo, Michael; Srinivasan, K.; Hwang, Mina; Gonzalez, Joseph E.; Ibarz, Julian; Finn, Chelsea; Goldberg, Ken

doi:10.1109/lra.2021.3070252

Cited by 115 publications

(65 citation statements)

References 12 publications


“…RL with safe constraints [72] has emerged to consider applications in the real world. Some studies have refined the constraints or rewards for safe ranges in states, which can help in avoidance of unstable areas.…”

Section: Reinforcement Learning For Diverse Fields (mentioning)

confidence: 99%

“…In this study, actor-critic methods consisting of deep learning with a safe actor using features of hypo- and hyperglycemia were adopted from the perspectives of reinforcement learning and switching controllers for safety. Actor-critic methods have been widely implemented [39], [70], [72]-[74] in many RL applications.…”

Section: E Reinforcement Learning For Regulation Of Glucose (mentioning)

confidence: 99%


A Blood Glucose Control Framework Based on Reinforcement Learning With Safety and Interpretability: In Silico Validation

et al. 2021


Controlling blood glucose levels in diabetic patients is important for managing their health and quality of life. Several algorithms based on model predictive control and reinforcement learning (RL) have been proposed so far, most of which use prior knowledge of physiological systems, the mathematical structure of blood glucose dynamics, and many episodes including failures for training the policy network in RL. To be smoothly adopted in clinical settings, we propose a fast online learning method underlining safety and interpretability. A random forest regressor and a dual attention network were exploited for glucose prediction and extension of state variables. The soft actor-critic network to determine insulin dosing was guided by proportional-integral-derivative (PID) control in the early phase, and an adaptive safe actor with suspension and additional insulin dosing was incorporated. The performance of the models was validated using an FDA-approved type 1 diabetes simulator. The results showed comparable outcomes with PID control. Using this system, glucose dynamics could be captured despite minimal prior knowledge. The extended state variables were correlated with basic states such as glucose, insulin, and meal intake, their derivatives, and their integrals, which can be fundamental elements of mathematical modeling of physiological responses. Attention scores and attribution scores in the prediction and control models represented the focused features and the internal operation of the models with interpretability. We expect this study to provide some insights on how RL can be practically adopted in clinical environments and how interpretability can provide hints of machines' thoughts for clinical applications.

Index Terms: blood glucose control, reinforcement learning, safe and interpretable control, in silico validation, simulation for clinical application


“…For the suturing task, including the works related to knot tying and needle insertion, we reported the following: [16], [70], [85], [88], [89], [161], [178], [195]-[197], [203], [243], and [246]. The pick, transfer, and place task was mainly characterized by experiments relying on pegs and rings from the fundamentals of the laparoscopic surgery training paradigm [54], [69], [91], [96], [102], [151]-[153], [163], [264] or new surgical tools [177]. A lot of the remaining works focused on tissue interaction.…”

Section: Instrument Control (mentioning)

confidence: 99%

Accelerating Surgical Robotics Research: A Review of 10 Years With the da Vinci Research Kit

D'Ettorre, Mariani, Stilli, et al. 2021

IEEE Robot. Automat. Mag.


This article has been accepted for inclusion in a future issue of this journal. Content is final as presented, with the exception of pagination.

Robotic-assisted surgery is now well established in clinical practice and has become the gold-standard clinical treatment option for several clinical indications. The field of robotic-assisted surgery is expected to grow substantially in the next decade, with a range of new robotic devices emerging to address unmet clinical needs across different specialties. A vibrant surgical robotics research community is pivotal for conceptualizing such new systems as well as for developing and training the engineers and scientists to translate them into practice. The da Vinci Research Kit (dVRK), an academic and industry collaborative effort to repurpose decommissioned da Vinci surgical systems [Intuitive Surgical Inc. (ISI), California, USA] as a research platform for surgical robotics research, has been a key initiative for addressing a barrier to entry for new research groups in surgical robotics. In this article, we present an extensive review of the publications that have been facilitated by the dVRK over the past decade. We classify research efforts into different categories and outline some of the major challenges and needs for the robotics community to maintain and build upon this initiative.


“…In contrast to end-to-end robot policy learning [3,20,21,22], a popular approach for generating behaviors is to parameterize motions by the outputs of a perception network [4,5,10,17,23]. This decouples perception from planning and control, and enables perception systems to be trained in simulation without the need for accurate physical simulation.…”

Section: Parameterized Representations For Manipulation (mentioning)

confidence: 99%

SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo

Kollar, Laskey, Stone, et al. 2021

Preprint

Self Cite


Robot manipulation of unknown objects in unstructured environments is a challenging problem due to the variety of shapes, materials, arrangements and lighting conditions. Even with large-scale real-world data collection, robust perception and manipulation of transparent and reflective objects across various lighting conditions remains challenging. To address these challenges we propose an approach to performing sim-to-real transfer of robotic perception. The underlying model, SimNet, is trained as a single multi-headed neural network using simulated stereo data as input and simulated object segmentation masks, 3D oriented bounding boxes (OBBs), object keypoints and disparity as output. A key component of SimNet is the incorporation of a learned stereo sub-network that predicts disparity. SimNet is evaluated on 2D car detection, unknown object detection and deformable object keypoint detection and significantly outperforms a baseline that uses a structured light RGB-D sensor. By inferring grasp positions using the OBB and keypoint predictions, SimNet can be used to perform end-to-end manipulation of unknown objects in both "easy" and "hard" scenarios using our fleet of Toyota HSR robots in four home environments. In unknown object grasping experiments, the predictions from the baseline RGB-D network and SimNet enable successful grasps of most of the "easy" objects. However, the RGB-D baseline only grasps 35% of the "hard" (e.g., transparent) objects, while SimNet grasps 95%, suggesting that SimNet can enable robust manipulation of unknown objects, including transparent objects, in unknown environments. Additional visualizations and materials are located at https://tinyurl.com/simnet-corl.
