Robotics: Science and Systems XV 2019
DOI: 10.15607/rss.2019.xv.023

Learning Reward Functions by Integrating Human Demonstrations and Preferences

Abstract: Our goal is to accurately and efficiently learn reward functions for autonomous robots. Current approaches to this problem include inverse reinforcement learning (IRL), which uses expert demonstrations, and preference-based learning, which iteratively queries the user for her preferences between trajectories. In robotics however, IRL often struggles because it is difficult to get high-quality demonstrations; conversely, preference-based learning is very inefficient since it attempts to learn a continuous, high…
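As a rough illustration of how demonstrations and preference queries can be combined, the sketch below assumes a linear reward over hand-crafted trajectory features, treats the demonstration as inducing a soft-optimality prior over reward weights, and updates that prior with Bradley-Terry style preference likelihoods. All function names, feature values, and rationality coefficients are assumptions for illustration; this is not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_unit_weights(n, d):
    """Sample candidate reward weights uniformly on the unit sphere."""
    w = rng.normal(size=(n, d))
    return w / np.linalg.norm(w, axis=1, keepdims=True)

def demo_log_likelihood(w, demo_features, beta_demo=1.0):
    """Soft-optimality assumption: demonstrated trajectories have high reward."""
    return beta_demo * (w @ demo_features.T).sum(axis=1)

def pref_log_likelihood(w, feat_a, feat_b, answer, beta_pref=1.0):
    """Bradley-Terry style likelihood of preferring A (answer=1) over B (answer=0)."""
    diff = beta_pref * (w @ (feat_a - feat_b))
    p_a = 1.0 / (1.0 + np.exp(-diff))
    return np.log(np.where(answer == 1, p_a, 1.0 - p_a) + 1e-12)

# Toy setup: 3 reward features, one demonstration, two preference answers.
d = 3
W = sample_unit_weights(5000, d)                  # prior samples over reward weights
demo_features = np.array([[1.0, 0.2, -0.5]])      # features of the demonstrated trajectory
log_post = demo_log_likelihood(W, demo_features)  # demonstration-induced coarse prior

queries = [(np.array([0.9, 0.0, 0.0]), np.array([0.0, 0.9, 0.0]), 1),
           (np.array([0.0, 0.0, 1.0]), np.array([0.5, 0.5, 0.0]), 0)]
for feat_a, feat_b, answer in queries:
    log_post += pref_log_likelihood(W, feat_a, feat_b, answer)

post = np.exp(log_post - log_post.max())
post /= post.sum()
w_hat = post @ W                                  # posterior mean reward weights
print("estimated reward weights:", w_hat)
```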

Citations: cited by 76 publications (62 citation statements)
References: 23 publications (47 reference statements)
“…Here, it is assumed that the collaborative agent has access to some underlying human reward function (usually inferred through IRL or inverse planning approaches). The human is modeled to act rationally with the highest probability, but with a non-zero probability of behaving sub-optimally [20, 47-50].…”
Section: First-order Mental Models (mentioning, confidence: 99%)
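One common instantiation of this noisily-rational human model is a Boltzmann (softmax) policy over the human's action values. The sketch below is illustrative only; the rationality coefficient `beta` and the toy Q-values are assumptions, not taken from the cited works.

```python
import numpy as np

def boltzmann_policy(q_values, beta=2.0):
    """Noisily-rational human model: P(a | s) proportional to exp(beta * Q(s, a)).

    beta -> infinity recovers a perfectly rational human; beta = 0 gives
    uniformly random (fully sub-optimal) behaviour.
    """
    logits = beta * (q_values - np.max(q_values))  # subtract max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# The best action is chosen with the highest probability,
# but worse actions retain non-zero probability.
print(boltzmann_policy(np.array([1.0, 0.5, -0.2])))
```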
“…MacGlashan et al. [25] presented a system to ground natural language commands to reward functions that captured a desired task, and used natural language as an interface for specifying rewards. Palan et al. [26] used demonstrations to learn a coarse prior over the space of reward functions, to reduce the effective size of the space from which queries are generated.…”
Section: Related Work (mentioning, confidence: 99%)
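One simple way to realize "generating queries from the reduced space" is to score candidate trajectory pairs by how strongly the current posterior over reward weights disagrees about which trajectory is better, and ask the user about the most contested pair. The heuristic and all names below are assumptions for illustration, not the paper's actual query-selection objective.

```python
import numpy as np

rng = np.random.default_rng(1)

def query_uncertainty(weight_samples, feat_a, feat_b, beta=1.0):
    """Uncertainty of the posterior about which trajectory is better;
    peaks when the mean preference probability is near 0.5."""
    diff = beta * (weight_samples @ (feat_a - feat_b))
    p_a = 1.0 / (1.0 + np.exp(-diff))
    return np.mean(p_a) * (1.0 - np.mean(p_a))

def pick_query(weight_samples, candidate_features):
    """Choose the pair of candidate trajectories the posterior is most unsure about."""
    best, best_score = None, -np.inf
    n = len(candidate_features)
    for i in range(n):
        for j in range(i + 1, n):
            score = query_uncertainty(weight_samples,
                                      candidate_features[i], candidate_features[j])
            if score > best_score:
                best, best_score = (i, j), score
    return best

# Toy usage: posterior samples over 3 reward features, 4 candidate trajectories.
W = rng.normal(size=(1000, 3))
candidates = rng.normal(size=(4, 3))
print("most contested pair:", pick_query(W, candidates))
```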
“…Ma et al. [45] employed the RGB image as the visual input and presented a DRL-based mapless motion planner, alleviating the need for interactions between the agent and the environment. A few special techniques and model structures are also used in navigation tasks, including multiple subtasks to assist reinforcement learning [46], continuous motion control based on DDPG [47], and target-driven navigation [48]. Most of the above-mentioned methods focus on improving the reinforcement learning structure, and the reward is mostly sparse.…”
Section: Related Work (mentioning, confidence: 99%)