2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
DOI: 10.1109/iros.2016.7759328

Watch this: Scalable cost-function learning for path planning in urban environments

Abstract: In this work, we present an approach to learn cost maps for driving in complex urban environments from a very large number of demonstrations of driving behaviour by human experts. The learned cost maps are constructed directly from raw sensor measurements, bypassing the effort of manually designing cost maps as well as features. When deploying the learned cost maps, the trajectories generated not only replicate human-like driving behaviour but are also demonstrably robust against systematic errors in putative …
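
As a rough illustration of the idea in the abstract (a hedged sketch, not the authors' implementation): the cost map can be produced by a small fully-convolutional network that maps a rasterised grid of raw sensor measurements to a per-cell cost. The PyTorch architecture and all names below are illustrative assumptions.

```python
# Illustrative sketch only: a fully-convolutional network mapping a raw sensor
# raster to a per-cell cost map, in the spirit of learned cost-function planning.
import torch
import torch.nn as nn

class CostMapNet(nn.Module):
    def __init__(self, in_channels: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, kernel_size=1),  # one cost value per grid cell
        )

    def forward(self, sensor_grid: torch.Tensor) -> torch.Tensor:
        # sensor_grid: (batch, channels, H, W) raster of raw sensor measurements
        return self.net(sensor_grid).squeeze(1)  # (batch, H, W) cost map

# Example usage on a dummy 128x128 grid with 3 input channels:
costs = CostMapNet()(torch.randn(1, 3, 128, 128))
```

A planner deployed on such a map would then search for minimum-cost trajectories, which is how the learned behaviour is replicated at test time.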

Cited by 112 publications (131 citation statements). References 24 publications.

“…IOC for path prediction: Kitani et al. [23] recover human preferences (i.e., a reward function) to forecast plausible paths for a pedestrian using inverse optimal control (IOC), also known as inverse reinforcement learning (IRL) [1,52], while [26] adapts IOC and proposes a dynamic reward function to handle changing environments in sequential path prediction. Combined with a deep neural network, deep IOC/IRL has been proposed to learn non-linear reward functions and has shown promising results in robot control [11] and driving [50] tasks. However, one critical assumption made in IOC frameworks, which makes them hard to apply to general path prediction tasks, is that the goal state or destination of the agent must be given a priori, so that feasible paths to that destination can be found from the planning or control point of view.…”
Section: Related Work (mentioning)
confidence: 99%
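
A brief contextual note (not taken from the quoted paper): the IOC/IRL methods referred to above typically rest on the maximum-entropy model, under which a trajectory's probability grows exponentially with its accumulated reward and the normaliser sums over feasible paths to the given goal:

\[
P(\tau \mid \theta) = \frac{1}{Z(\theta)} \exp\!\Big(\sum_{s \in \tau} r_\theta(s)\Big),
\qquad
Z(\theta) = \sum_{\tau' \rightarrow \text{goal}} \exp\!\Big(\sum_{s \in \tau'} r_\theta(s)\Big).
\]

The need to evaluate Z(θ) over paths to a known goal is exactly the a-priori destination assumption criticised in the quotation.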
“…Recently, Wulfmeier et al. [29][30][31] proposed deep IRL, which combines MaxEnt-IRL with a deep neural network architecture to find nonlinear reward functions. However, their method suffers from the same three problems as MaxEnt-IRL.…”
Section: Related Work (mentioning)
confidence: 99%
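
For context on the deep IRL scheme cited above, here is a minimal sketch of one MaxEnt-IRL update with a neural-network reward, in which the difference between expert and expected state visitation frequencies (SVFs) is backpropagated through the network. The helpers `compute_expected_svf` and `expert_svf`, and the PyTorch setup, are assumptions for illustration, not code from the cited works.

```python
# Hedged sketch of a single deep MaxEnt-IRL update step (illustrative only).
import torch

def irl_update(net, optimizer, sensor_grid, expert_svf, compute_expected_svf):
    optimizer.zero_grad()
    cost_map = net(sensor_grid)   # learned per-cell cost from the raw sensor raster
    reward = -cost_map            # reward is taken as the negative of the cost
    # A soft planner (e.g. value iteration) yields the expected state visitation
    # frequencies under the current reward; it is treated as constant w.r.t. the net.
    mu_expected = compute_expected_svf(reward.detach())
    # Gradient of the negative data log-likelihood w.r.t. the reward map is
    # (expected SVF - expert SVF); feeding it as the upstream gradient and then
    # stepping the optimizer ascends the demonstration likelihood.
    reward.backward(gradient=(mu_expected - expert_svf))
    optimizer.step()
```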
“…However, we could not compare our method with deep MaxEnt-IRL [29][30][31] because deep MaxEnt-IRL has to find an optimal policy at every iteration, which took an enormous amount of time on Atari 2600 games. Therefore, we selected PI_LOC [9] for comparison, because D_b can be used to evaluate the partition function.…”
Section: Atari Games (mentioning)
confidence: 99%
“…For path planning, the trajectory with the highest probability under the learned model is chosen, with the goal of closely imitating pedestrian motion. Wulfmeier et al. [12] present a similar approach that uses deep IRL instead of a combination of classical features to learn how to drive an autonomous car through static environments.…”
Section: A. Learning By Demonstration (mentioning)
confidence: 99%