2018
DOI: 10.1561/2300000053

An Algorithmic Perspective on Imitation Learning

Abstract: As robots and other intelligent agents move from simple environments and problems to more complex, unstructured settings, manually programming their behavior has become increasingly challenging and expensive. Often, it is easier for a teacher to demonstrate a desired behavior rather than attempt to manually engineer it. This process of learning from demonstrations, and the study of algorithms to do so, is called imitation learning. This work provides an introduction to imitation learning. It covers the underly…
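To make the abstract's core idea concrete, here is a minimal sketch of behavioral cloning, the simplest imitation-learning approach: fit a supervised model that maps observed states to the teacher's demonstrated actions. This is a generic illustration, not the monograph's specific method; the function names and the synthetic demonstration data are assumptions.

```python
# Minimal behavioral-cloning sketch: treat imitation learning as
# supervised regression from demonstrated states to actions.
import numpy as np
from sklearn.neural_network import MLPRegressor

def behavioral_cloning(demo_states, demo_actions):
    """Fit a policy pi(s) -> a from teacher demonstrations."""
    policy = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)
    policy.fit(demo_states, demo_actions)
    return policy

# Hypothetical demonstrations: 100 state-action pairs with 4-D states
# and 2-D continuous actions (stand-ins for a real teacher's behavior).
rng = np.random.default_rng(0)
states = rng.normal(size=(100, 4))
actions = states[:, :2] * 0.5
policy = behavioral_cloning(states, actions)
print(policy.predict(states[:1]))  # imitated action for a queried state
```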

Cited by 385 publications (329 citation statements); references 138 publications.
“…These methods attempt to generalize from previously observed interactions to predict multi-agent behavior in new situations. Forecasting is related to Imitation Learning [25], which learns a model to mimic demonstrated behavior. In contrast to some Imitation Learning methods, e.g.…”
Section: Related Work
confidence: 99%
“…Since we used 3 features, θ's dimensionality was 3, leading to a possible set Θ equivalent to the 3-fold Cartesian product of the values above. After normalizing to norm 1, we were left with 19 unique θ vectors in Θ, weighing the three features in different proportions, as shown in Figures 3, 7, 8, 9, and 10. Our discretization scheme ensured an approximately uniform sampling on the positive quadrant of the unit sphere.…”
Section: Appendix A: Practical Considerations
confidence: 99%
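The discretization described in that excerpt is easy to reproduce. The sketch below takes the 3-fold Cartesian product of a small per-feature value set, drops the zero vector, normalizes each vector to unit norm, and keeps the unique directions. The value set {0, 0.5, 1} is an assumption (the excerpt does not quote the exact values), but it does yield exactly the reported 19 unique θ vectors on the non-negative part of the unit sphere.

```python
# Reconstruct a 3-feature weight-vector grid: Cartesian product,
# normalize to norm 1, deduplicate directions.
import itertools
import numpy as np

values = [0.0, 0.5, 1.0]                      # assumed per-feature grid
thetas = set()
for theta in itertools.product(values, repeat=3):
    theta = np.array(theta)
    norm = np.linalg.norm(theta)
    if norm == 0:
        continue                              # skip the all-zero vector
    thetas.add(tuple(np.round(theta / norm, 6)))

print(len(thetas))  # -> 19 unique normalized weight vectors
```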
“…Most similar to our setting is inverse reinforcement learning (IRL), an instance of LfD where the robot learns the correct reward function from human demonstrations [3], [4]. Prior works on IRL generally assume that every human has a single, fixed teaching strategy [5]: the human teaches by providing optimal demonstrations, and any sub-optimal human behavior is interpreted as noise [6]–[9]. Alternatively, robots can also learn about the human while learning from that human.…”
Section: Related Work, A. Robots Learning From Humans
confidence: 99%
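A common way to formalize "sub-optimal behavior is interpreted as noise" is a Boltzmann-rational observation model, in which the probability of a demonstrated trajectory grows exponentially with its reward. The sketch below is a generic illustration of that idea under assumed feature counts and weights, not the specific model of any one cited paper.

```python
# Boltzmann-rational demonstration model: P(trajectory | theta) is a
# softmax over trajectory rewards, so sub-optimal demos get non-zero
# probability and are explained as noise rather than intent.
import numpy as np

def demo_likelihood(features, theta, beta=5.0):
    """P(each candidate trajectory | reward weights theta).

    features: (n_trajectories, n_features) feature counts per trajectory
    theta:    (n_features,) reward weights
    beta:     rationality coefficient (beta -> inf recovers optimality)
    """
    logits = beta * (features @ theta)
    logits -= logits.max()                    # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# Three hypothetical trajectories, each summarized by 3 feature counts.
features = np.array([[1.0, 0.2, 0.0],
                     [0.5, 0.5, 0.5],
                     [0.0, 0.1, 1.0]])
theta = np.array([0.8, 0.1, 0.1])
print(demo_likelihood(features, theta))  # highest-reward demo is most likely
```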
“…To compare our learning with strategy uncertainty against the state-of-the-art in a realistic problem setting, we performed a simulated user study. We here consider an instance of inverse reinforcement learning (IRL): the human demonstrates a policy, and the robot attempts to infer the human's reward function from that demonstrated policy [3]- [5]. Unlike the example in Section IV-C, now θ * (the human's reward parameters) and φ * (the human's demonstration strategy) lie in continuous spaces.…”
Section: Robot Learning Simulations
confidence: 99%
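When θ* lies in a continuous space, as in that excerpt, the reward posterior is typically approximated by sampling. The sketch below draws posterior samples of θ with a Metropolis-Hastings random walk, in the spirit of Bayesian IRL, reusing the Boltzmann likelihood from the previous sketch. The prior, proposal scale, and data are illustrative assumptions, not the cited study's setup.

```python
# Metropolis-Hastings over continuous reward weights theta, given one
# observed demonstration under a Boltzmann-rational likelihood.
import numpy as np

rng = np.random.default_rng(1)
features = np.array([[1.0, 0.2, 0.0],
                     [0.5, 0.5, 0.5],
                     [0.0, 0.1, 1.0]])
demo_idx = 0   # index of the trajectory the human actually demonstrated
beta = 5.0

def log_posterior(theta):
    logits = beta * (features @ theta)
    logits -= logits.max()
    log_lik = logits[demo_idx] - np.log(np.exp(logits).sum())
    log_prior = -0.5 * theta @ theta          # standard normal prior
    return log_lik + log_prior

theta = np.zeros(3)
samples = []
for _ in range(5000):
    proposal = theta + rng.normal(scale=0.3, size=3)
    if np.log(rng.random()) < log_posterior(proposal) - log_posterior(theta):
        theta = proposal                      # accept the proposed move
    samples.append(theta)

print(np.mean(samples[1000:], axis=0))  # posterior-mean reward weights
```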