Robotics: Science and Systems XVII 2021
DOI: 10.15607/rss.2021.xvii.030

Inferring Objectives in Continuous Dynamic Games from Noise-Corrupted Partial State Observations

Abstract: Robots and autonomous systems must interact with one another and their environment to provide high-quality services to their users. Dynamic game theory provides an expressive theoretical framework for modeling scenarios involving multiple agents with differing objectives interacting over time. A core challenge when formulating a dynamic game is designing objectives for each agent that capture desired behavior. In this paper, we propose a method for inferring parametric objective models of multiple agents based…

Cited by 17 publications (16 citation statements)
References 18 publications
“…In [31], state and input constraints were also considered in a maximum-entropy residual-minimization framework. In [32], agents' cost functions were estimated under partial observability.…”
Section: B Multi-agent IRL (mentioning)
confidence: 99%
“…In particular, scenarios in which two groups of players have opposing objectives, such as robust control problems and pursuit-evasion games, are often formulated as zero-sum dynamic games [10,17]. Meanwhile, problems in which multiple players have only partially conflicting objectives, such as path planning in busy traffic, are posed as general-sum dynamic games [11,18]. Although solutions to continuous-time dynamic games are characterized by coupled Hamilton-Jacobi-Bellman (HJB) PDEs [8,9,19], solving these equations is typically intractable due to the so-called "curse of dimensionality," [20] i.e., their computation time grows exponentially in the state space dimension.…”
Section: B Multi-player Path Planning Via Dynamic Games (mentioning)
confidence: 99%
“…In this work, we presume that the ego player knows other players' objectives L_i. While this is certainly a strong assumption in practice, recent work has established that it is possible to infer unknown parameters of players' objectives in such games efficiently [18,25]. Thus equipped, we now define the Nash equilibrium of the GTP-SLAM problem.…”
Section: A Open-loop Nash Equilibria (mentioning)
confidence: 99%
“…As ellipsoidal overapproximations seem less suitable to model human behavior and may furthermore be conservative in small distances (where interaction is more pronounced), we instead extend the collision avoidance formulation of [14], which involves no approximation of rectangular obstacles. b) Online learning based on an optimal control model. We propose an online learning methodology to continually update our estimate of other player's costs and constraints by adapting standard inverse optimal control methodologies such as [15]-[17] to our game-theoretical framework. We show empirically that this approach performs well in practice, demonstrating successful closed-loop navigation, where the certainty-equivalent controller exhibits suboptimal or even dangerous behavior.…”
Section: A Background and Motivation (mentioning)
confidence: 99%
“…We have thus far assumed full access to the human cost J_ν and private constraint functions h_ν, ν ∈ H. We now relax this assumption by introducing a practical methodology for learning this information online from observed data. Our approach is similar in spirit to the methodologies proposed in [15]-[17], although there are some key differences. First of all, we parametrize not only the cost but also the constraints.…”
Section: Online Learning Of Parameters (mentioning)
confidence: 99%