2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI)
DOI: 10.1109/hri.2019.8673256

On the Utility of Model Learning in HRI

Abstract: Fundamental to robotics is the debate between model-based and model-free learning: should the robot build an explicit model of the world, or learn a policy directly? In the context of HRI, part of the world to be modeled is the human. One option is for the robot to treat the human as a black box and learn a policy for how they act directly. But it can also model the human as an agent, and rely on a "theory of mind" to guide or bias the learning (grey box). We contribute a characterization of the performance of…
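The black-box/grey-box distinction the abstract draws can be made concrete with a toy sketch. Nothing below is the paper's implementation; the reward form, the Boltzmann-rationality model, and all names (reward, human_action, black_box, loglik) are illustrative assumptions. The black box fits p(action | state) directly from data; the grey box keeps a "theory of mind" (noisy rationality) and fits only the reward weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a simulated human picks one of 3 discrete actions in a 2-D state.
STATES = rng.normal(size=(200, 2))       # observed states
TRUE_W = np.array([1.5, -0.8])           # the human's hidden reward weights

def reward(state, action, w):
    # Hypothetical linear reward; each action scales the features differently.
    return (action + 1) * (state @ w)

def human_action(state, w, beta=3.0):
    # "Theory of mind" assumption: the human is noisily rational (Boltzmann).
    q = beta * np.array([reward(state, a, w) for a in range(3)])
    p = np.exp(q - q.max()); p /= p.sum()
    return rng.choice(3, p=p)

ACTIONS = np.array([human_action(s, TRUE_W) for s in STATES])

# Black box: estimate p(action | state) directly from data, with no agency
# assumption (a crude count per state-space quadrant; a net in practice).
def black_box(state):
    mask = ((STATES[:, 0] > 0) == (state[0] > 0)) & \
           ((STATES[:, 1] > 0) == (state[1] > 0))
    counts = np.bincount(ACTIONS[mask], minlength=3)
    return counts / counts.sum()

# Grey box: keep the noisy-rationality model and fit only the reward weights
# by maximum likelihood (grid search, for simplicity).
def loglik(w, beta=3.0):
    ll = 0.0
    for s, a in zip(STATES, ACTIONS):
        q = beta * np.array([reward(s, b, w) for b in range(3)])
        ll += q[a] - (q.max() + np.log(np.exp(q - q.max()).sum()))
    return ll

grid = np.linspace(-2.0, 2.0, 21)
w_hat = max((np.array([w1, w2]) for w1 in grid for w2 in grid), key=loglik)
print("black-box p(a | s=[1,1]):", black_box(np.array([1.0, 1.0])))
print("grey-box recovered weights:", w_hat)   # should land near TRUE_W
```

Intuitively, the grey box has far fewer parameters to fit and so can do more with scarce data, while the black box stays flexible when the human deviates from the assumed model; characterizing when each wins is the question the paper studies.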

Cited by 41 publications (32 citation statements)
References 26 publications
“…[Imita]tion Learning or Behavioral Cloning [47] is an attempt to mimic the policy of another human or agent. On the other hand, Inverse Reinforcement Learning [48,49] aims to recover reward functions. Modeling the impact of other agents has also been shown to be useful for stabilizing the training process in multi-agent reinforcement learning [50].…”
Section: Modelling Other Agents
confidence: 99%
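To unpack the contrast in this snippet: Behavioral Cloning treats demonstrations as supervised data and mimics the policy directly, while Inverse Reinforcement Learning searches for a reward that explains them. A minimal Behavioral Cloning sketch (synthetic data; all names hypothetical, with logistic regression standing in for whatever function class is used in practice):

```python
import numpy as np

rng = np.random.default_rng(1)

# Demonstrations: states X and the expert's binary actions A.
X = rng.normal(size=(500, 4))                        # 500 states, 4 features
A = (X @ np.array([1.0, -1.0, 0.5, 0.0]) > 0).astype(float)

# Behavioral cloning = supervised learning of p(a | x); here a logistic
# regression fit by plain gradient ascent on the mean log-likelihood.
w = np.zeros(4)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))               # predicted p(a=1 | x)
    w += 0.5 * X.T @ (A - p) / len(X)                # log-likelihood gradient

cloned_policy = lambda x: int(x @ w > 0)             # the cloned policy
acc = np.mean([cloned_policy(x) == a for x, a in zip(X, A)])
print("cloned policy train accuracy:", acc)
```

An IRL variant would instead score candidate reward weights by how well the policy they induce reproduces the demonstrations, as in the grey-box sketch earlier.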
“…If slowing down is necessary for collision avoidance or because a stop sign is coming up, that is what the planner will do. And indeed, prior work has shown such predictors to be preferable in highly interactive domains, depending on how one collects their training data [4].…”
Section: Test Data
confidence: 99%
“…following a Gaussian distribution with covariance matrix Σ = diag(0.75, 0.75). We are interested in the task of steering the robot into a goal set [5,7] × [5,7]. To construct the abstraction-based controller, we partition the state space with discretization parameters (0.5, 0.5), and the input space with (0.1, 0.1). This leads to a total number of states equal to |𝑋| = 1600 and a number of inputs equal to |𝑈| = 441 (by including the upper and lower limits of the input space as additional inputs), leading to a complexity of |𝑋 × 𝑈| = 705600.…”
Section: Benchmarks and Performance
confidence: 99%
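The quoted snippet omits the state- and input-space bounds, but the reported sizes pin them down up to translation. The check below assumes a [0, 20]² state space and a [−1, 1]² input space; these bounds are an inference consistent with the counts, not given in the text:

```python
# Assumed bounds (NOT stated in the quoted snippet): state space [0, 20]^2,
# input space [-1, 1]^2. These assumptions reproduce the reported sizes.
cells_per_state_dim = int((20 - 0) / 0.5)       # 40 cells of width 0.5
inputs_per_dim = round((1 - (-1)) / 0.1) + 1    # 21 points, endpoints included

n_states = cells_per_state_dim ** 2             # 40 * 40 = 1600
n_inputs = inputs_per_dim ** 2                  # 21 * 21 = 441
print(n_states, n_inputs, n_states * n_inputs)  # 1600 441 705600
```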
“…Third, the use of neural networks to guide the design of the abstraction-based controller opens the door to encoding the human's preferences for how a dynamical system should act. Such human preferences are crucial in several real-world settings in which a human user or operator interacts with an autonomous dynamical system [5]. Recent research has found that human preferences can be efficiently captured using expert demonstrations and preference-based learning; such preferences can be hard to capture accurately in the form of a logical formula or a reward function [10].…”
Section: Introduction
confidence: 99%
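Preference-based learning of the kind cited here is commonly formalized with a Bradley-Terry style model: the probability that the human prefers one behavior over another is a logistic function of the difference in their feature-based rewards. A minimal, self-contained sketch with synthetic preferences (all data and names hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)

TRUE_W = np.array([0.7, -1.2, 0.3])   # the human's hidden preference weights

# Each candidate behavior is summarized by a feature vector; the human is
# shown pairs (a, b) and reports which one they prefer.
feats_a = rng.normal(size=(300, 3))
feats_b = rng.normal(size=(300, 3))
diff = feats_a - feats_b
# Bradley-Terry model: P(prefer a over b) = sigmoid((phi_a - phi_b) @ w).
prefers_a = (rng.random(300) < 1 / (1 + np.exp(-diff @ TRUE_W))).astype(float)

# Recover w by gradient ascent on the preference log-likelihood.
w = np.zeros(3)
for _ in range(500):
    p = 1 / (1 + np.exp(-diff @ w))
    w += 0.5 * diff.T @ (prefers_a - p) / len(diff)

print("recovered preference weights:", np.round(w, 2))  # near TRUE_W
```

The recovered weights then serve as the reward signal that would otherwise have to be hand-written as a logical formula or reward function.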