2021
DOI: 10.48550/arxiv.2104.13617
Preprint

End-to-End Intersection Handling using Multi-Agent Deep Reinforcement Learning


Cited by 3 publications (5 citation statements)
References: 0 publications
“…Instead of making simple assumptions, another way to consider traffic rules is to explicitly incorporate information into the state space or reward function of the DRL agent. In [13], stop lines and yield lines are represented in the state space using a grid map with different colors. The positions of other vehicles and their priority levels are also embedded within the state space.…”
Section: Ego Vehicle (mentioning)
confidence: 99%
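The grid-map state encoding described in this citation statement can be illustrated with a short sketch. This is a hypothetical reconstruction for illustration only, not the implementation from [13]: the grid resolution, cell size, channel layout, and the particular "color" values chosen for stop and yield lines are all assumptions.

```python
import numpy as np

# Hypothetical sketch of an ego-centred grid-map observation in which stop and
# yield lines get distinct "colors" (channel values) and other vehicles are
# embedded together with a right-of-way priority level. All constants below are
# illustrative assumptions, not values from the cited paper.

GRID_H, GRID_W = 84, 84          # grid resolution (assumed)
CELL_SIZE = 0.5                  # metres per cell (assumed)

STOP_LINE_VALUE = 1.0            # "color" assigned to stop lines
YIELD_LINE_VALUE = 0.5           # "color" assigned to yield lines


def world_to_cell(x, y, ego_x, ego_y):
    """Map a world coordinate to a grid cell centred on the ego vehicle."""
    col = int((x - ego_x) / CELL_SIZE) + GRID_W // 2
    row = int((y - ego_y) / CELL_SIZE) + GRID_H // 2
    return row, col


def build_state(ego, stop_lines, yield_lines, vehicles):
    """Build a 3-channel grid-map observation.

    ego: (x, y); stop_lines / yield_lines: lists of (x, y) points;
    vehicles: list of dicts with keys 'x', 'y', 'priority' in [0, 1].
    Channels: 0 = road markings, 1 = vehicle occupancy, 2 = vehicle priority.
    """
    state = np.zeros((3, GRID_H, GRID_W), dtype=np.float32)

    for (x, y) in stop_lines:
        r, c = world_to_cell(x, y, *ego)
        if 0 <= r < GRID_H and 0 <= c < GRID_W:
            state[0, r, c] = STOP_LINE_VALUE

    for (x, y) in yield_lines:
        r, c = world_to_cell(x, y, *ego)
        if 0 <= r < GRID_H and 0 <= c < GRID_W:
            state[0, r, c] = YIELD_LINE_VALUE

    for v in vehicles:
        r, c = world_to_cell(v["x"], v["y"], *ego)
        if 0 <= r < GRID_H and 0 <= c < GRID_W:
            state[1, r, c] = 1.0              # occupancy
            state[2, r, c] = v["priority"]    # priority level of that vehicle

    return state
```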
“…4) Deep learning: Deep learning methods involving reinforcement learning [18] and recurrent neural networks [19] learn a planning and control policy, to be used by agents approaching an intersection. In practice, these policies do not generalize well to different environments and often do not provide guarantees in terms of safety and fairness.…”
Section: A Related Work (mentioning)
confidence: 99%
“…We evaluate our method by comparing it with state-ofthe-art planning methods [30], [23], [38], [25], [19], [2] for unsignaled and uncontrolled environments and show a maximum reduction in the number of collisions and deadlocks by up to 30%. Additionally, we compare our algorithm with an ablated version that does not use turn-based orderings and show that the time taken for all agents to navigate the scenarios increases in the latter case.…”
Section: A Main Contributions (mentioning)
confidence: 99%
“…In Table I, we compare our approach with the current stateof-the-art in navigating unsignaled intersections, roundabouts, and merging scenarios on the basis of optimality guarantees, multi-agent versus single-agent planning (MAP), description of action space (AS), incentive compatibility (IC), and realworld applicability. DRL-based methods [2], [19], [20], [25], [26] learn a navigation policy using the notion of expected reward received by an agent from taking a particular action in a particular state. This policy is learned from trajectories obtained via traffic simulators using Q-learning [27] and is very hard as well as expensive to train.…”
Section: Prior Work (mentioning)
confidence: 99%
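The Q-learning training loop that this statement says DRL-based methods rely on can be sketched in tabular form. This is a generic illustration under assumed hyperparameters, a discretised action space, and a hypothetical simulator interface (`env.reset`, `env.step`); it is not the training code of any cited work.

```python
import random
from collections import defaultdict

# Minimal tabular Q-learning sketch: learn an action-value function from
# trajectories collected in a traffic simulator. Actions, hyperparameters,
# and the `env` interface are illustrative assumptions.

ACTIONS = ["accelerate", "maintain", "brake"]   # assumed discrete action space
ALPHA, GAMMA, EPSILON = 0.1, 0.99, 0.1          # assumed hyperparameters

Q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})


def select_action(state):
    """Epsilon-greedy action selection over the learned Q-values."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(Q[state], key=Q[state].get)


def q_update(state, action, reward, next_state, done):
    """One-step update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = reward if done else reward + GAMMA * max(Q[next_state].values())
    Q[state][action] += ALPHA * (target - Q[state][action])


def train(env, episodes=1000):
    """Collect rollouts from a hypothetical simulator `env` and learn a policy."""
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            action = select_action(state)
            next_state, reward, done = env.step(action)
            q_update(state, action, reward, next_state, done)
            state = next_state
```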