Distributed Reinforcement Learning of Targeted Grasping with Active Vision for Mobile Manipulators

Fujita, Yasuhiro; Uenishi, Kota; Ummadisingu, Avinash; Nagarajan, Prabhat; Masuda, Shimpei; Castro, Mario Ynocente

doi:10.1109/iros45743.2020.9341605

Cited by 9 publications

(7 citation statements)

References 21 publications


“…Furthermore, the authors in works [90, 91, 92] assumed prior knowledge and relied on a target object with a specified color to be retrieved. In contrast, Fujita et al [93] accepted the target object as an image instead of a segmentation module [91, 92]. A deep RL system based on active vision has been used to retrieve in dense clutter.…”

Section: Critical Review (mentioning)

confidence: 99%

Review of Learning-Based Robotic Manipulation in Cluttered Environments

Mohammed, Kwek, Chua et al. 2022. Sensors


Robotic manipulation refers to how robots intelligently interact with the objects in their surroundings, such as grasping and carrying an object from one place to another. Dexterous manipulating skills enable robots to assist humans in accomplishing various tasks that might be too dangerous or difficult to do. This requires robots to intelligently plan and control the actions of their hands and arms. Object manipulation is a vital skill in several robotic tasks. However, it poses a challenge to robotics. The motivation behind this review paper is to review and analyze the most relevant studies on learning-based object manipulation in clutter. Unlike other reviews, this review paper provides valuable insights into the manipulation of objects using deep reinforcement learning (deep RL) in dense clutter. Various studies are examined by surveying existing literature and investigating various aspects, namely, the intended applications, the techniques applied, the challenges faced by researchers, and the recommendations adopted to overcome these obstacles. In this review, we divide deep RL-based robotic manipulation tasks in cluttered environments into three categories, namely, object removal, assembly and rearrangement, and object retrieval and singulation tasks. We then discuss the challenges and potential prospects of object manipulation in clutter. The findings of this review are intended to assist in establishing important guidelines and directions for academics and researchers in the future.


“…In the preceding deterministic MDP formulation, we aim at solving a goal-reaching RL problem (Kaelbling, 1993b; Sutton et al, 2011; Andrychowicz et al, 2017; Andreas et al, 2017; Pong et al, 2018; Ghosh et al, 2019; Eysenbach et al, 2020a, 2020b; Kadian et al, 2020; Fujita et al, 2020; Chebotar et al, 2021; Khazatsky et al, 2021) or a planning problem (Bertsekas & Tsitsiklis, 1996; Boutilier et al, 1999; Sutton et al, 1999; Boutilier et al, 2000; Rintanen & Hoffmann, 2001; LaValle, 2006; Russell & Norvig, 2009; Nasiriany et al, 2019). We say a Q-function is successful if its associated greedy policy (Sutton & Barto, 2018)…”

Section: Successful Q-functions (mentioning)

confidence: 99%

“…Goal-Conditioned RL: Goal-conditioned RL, the problem of learning a policy that reaches certain goal states, has been empirically studied in many prior works (Kaelbling, 1993b; Sutton et al, 2011; Andrychowicz et al, 2017; Fu et al, 2018; Pong et al, 2018; Ghosh et al, 2019; Eysenbach et al, 2020a, 2020b; Kadian et al, 2020; Fujita et al, 2020; Chebotar et al, 2021; Khazatsky et al, 2021). The goal-conditioned RL is closely related to the sparse reward setting in our framework, where the agent only receives terminal rewards at the terminal (goal) states.…”

Section: Reward Design (mentioning)

confidence: 99%

Computational Benefits of Intermediate Rewards for Goal-Reaching Policy Learning

Zhai, Baek, Zhou et al. 2022. JAIR


Many goal-reaching reinforcement learning (RL) tasks have empirically verified that rewarding the agent on subgoals improves convergence speed and practical performance. We attempt to provide a theoretical framework to quantify the computational benefits of rewarding the completion of subgoals, in terms of the number of synchronous value iterations. In particular, we consider subgoals as one-way intermediate states, which can only be visited once per episode and propose two settings that consider these one-way intermediate states: the one-way single-path (OWSP) and the one-way multi-path (OWMP) settings. In both OWSP and OWMP settings, we demonstrate that adding intermediate rewards to subgoals is more computationally efficient than only rewarding the agent once it completes the goal of reaching a terminal state. We also reveal a trade-off between computational complexity and the pursuit of the shortest path in the OWMP setting: adding intermediate rewards significantly reduces the computational complexity of reaching the goal but the agent may not find the shortest path, whereas with sparse terminal rewards, the agent finds the shortest path at a significantly higher computational cost. We also corroborate our theoretical results with extensive experiments on the MiniGrid environments using Q-learning and some popular deep RL algorithms.
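The comparison this abstract describes can be illustrated with a toy example. The sketch below is not the paper's OWSP or OWMP construction; it assumes a simple forward-only chain with a "stay" action, a discount of 0.9, and a success criterion chosen here for illustration (the greedy policy, with ties broken toward "stay", walks from the start state to the goal). The function names `iterations_until_success` and `greedy_reaches_goal` are hypothetical.

```python
def iterations_until_success(n_states, rewarded, gamma=0.9, max_iters=1000):
    """Count synchronous value-iteration sweeps until the greedy policy reaches the goal.

    States 0..n_states form a chain; state n_states is the terminal goal.
    Actions: 0 = stay (s -> s), 1 = forward (s -> s + 1).
    `rewarded` is the set of states that pay reward 1 when entered via forward.
    Returns the sweep count, or None if max_iters is exhausted.
    """
    goal = n_states
    # q[s] = [Q(s, stay), Q(s, forward)]; the goal row stays at zero (terminal).
    q = [[0.0, 0.0] for _ in range(goal + 1)]
    for sweep in range(1, max_iters + 1):
        # Snapshot of state values from the previous sweep (synchronous update).
        v = [max(row) for row in q]
        for s in range(goal):
            q[s][0] = gamma * v[s]
            q[s][1] = (1.0 if s + 1 in rewarded else 0.0) + gamma * v[s + 1]
        if greedy_reaches_goal(q, goal):
            return sweep
    return None


def greedy_reaches_goal(q, goal):
    """Roll out the greedy policy from state 0; ties break toward 'stay'."""
    s = 0
    for _ in range(goal):
        if q[s][1] <= q[s][0]:
            return False  # an uninformed state never advances
        s += 1
    return s == goal


if __name__ == "__main__":
    sparse = iterations_until_success(12, {12})         # terminal reward only
    dense = iterations_until_success(12, {4, 8, 12})    # subgoal rewards added
    print("sparse:", sparse, "with subgoals:", dense)
```

In this toy chain, reward information travels one state per sweep, so the sweep count equals the longest distance from any state to the next rewarded state: 12 with only the terminal reward and 4 with subgoals every four states. With a single path there is no shortest-path trade-off to observe; that part of the abstract concerns the multi-path setting, which this sketch does not model.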


“…In recent years, the application of deep reinforcement learning [3, 4, 5, 6, 7, 8] in the robot field [9, 10] has deepened and has been widely used in grasping [11, 12], assembly [13], path planning [14, 15], and other fields [16, 17]. A few scholars have used deep reinforcement learning to study the constant force-tracking process, showing the great potential for applying deep reinforcement learning to solving the issue of constant force-tracking.…”

Section: Introduction (mentioning)

confidence: 99%

Constant Force-Tracking Control Based on Deep Reinforcement Learning in Dynamic Auscultation Environment

Zhang, Chen, Shu et al. 2023. Sensors


Intelligent medical robots can effectively help doctors carry out a series of medical diagnoses and auxiliary treatments and alleviate the current shortage of social personnel. Therefore, this paper investigates how to use deep reinforcement learning to solve dynamic medical auscultation tasks. We propose a constant force-tracking control method for dynamic environments and a modeling method that satisfies physical characteristics to simulate the dynamic breathing process and design an optimal reward function for the task of achieving efficient learning of the control strategy. We have carried out a large number of simulation experiments, and the error between the tracking of normal force and expected force is basically within ±0.5 N. The control strategy is tested in a real environment. The preliminary results show that the control strategy performs well in the constant force-tracking of medical auscultation tasks. The contact force is always within a safe and stable range, and the average contact force is about 5.2 N.
