2021
DOI: 10.48550/arxiv.2106.04561
Preprint
Safe Deep Q-Network for Autonomous Vehicles at Unsignalized Intersection

Cited by 3 publications (4 citation statements)
References 18 publications
“…In this section, experiments were conducted to verify the effectiveness of the proposed driving behavior decision-making method, compared with DQN [14], the combination of DQN and the prioritized experience replay method (Prioritized-DQN) [28], DDQN [36], and D3QN [35]. Firstly, the environment and parameter settings are described.…”
Section: Simulation Results and Discussion
confidence: 99%
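For readers unfamiliar with the prioritized experience replay this excerpt mentions, the following is a minimal, illustrative sketch of proportional prioritization in the style of Schaul et al.; the class name, parameters, and defaults are hypothetical and not taken from the cited papers' implementations.

```python
import numpy as np

class PrioritizedReplayBuffer:
    """Minimal proportional prioritized replay (illustrative sketch).

    Transitions are sampled with probability p_i^alpha / sum_k p_k^alpha,
    where p_i is the last absolute TD error plus a small epsilon.
    """

    def __init__(self, capacity, alpha=0.6, eps=1e-5):
        self.capacity = capacity
        self.alpha = alpha
        self.eps = eps
        self.data = []
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition):
        # New transitions get the current max priority so each is seen at least once.
        max_p = self.priorities.max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        p = self.priorities[: len(self.data)] ** self.alpha
        probs = p / p.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        # Importance-sampling weights correct the non-uniform sampling bias.
        weights = (len(self.data) * probs[idx]) ** (-beta)
        weights /= weights.max()
        return idx, [self.data[i] for i in idx], weights

    def update_priorities(self, idx, td_errors):
        self.priorities[idx] = np.abs(td_errors) + self.eps
```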
“…Then, the module is integrated into the dueling double deep Q network (D3QN) to make safer and more efficient decisions for autonomous driving [35]. Mokhtari et al. [36] utilized two long short-term memory (LSTM) models based on a double deep Q network (DDQN) and the prioritized experience replay method to reconstruct the perceived state of the environment and the future trajectories of pedestrians.…”
Section: Related Work
confidence: 99%
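For context on the D3QN named in this excerpt: a dueling value/advantage head combined with a double-DQN target can be sketched roughly as below, assuming a flat observation vector and a discrete action set. This is a generic PyTorch illustration, not the architecture from [35] or [36]; the layer sizes are arbitrary.

```python
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    """Dueling architecture: shared trunk, separate value and advantage streams."""

    def __init__(self, obs_dim, n_actions, hidden=128):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)
        self.advantage = nn.Linear(hidden, n_actions)

    def forward(self, obs):
        h = self.trunk(obs)
        v, a = self.value(h), self.advantage(h)
        # Subtract the mean advantage so V and A are separately identifiable.
        return v + a - a.mean(dim=1, keepdim=True)

def double_q_target(online, target, reward, next_obs, done, gamma=0.99):
    """Double-DQN target: the online net picks the action, the target net evaluates it."""
    with torch.no_grad():
        next_action = online(next_obs).argmax(dim=1, keepdim=True)
        next_q = target(next_obs).gather(1, next_action).squeeze(1)
        return reward + gamma * (1.0 - done) * next_q
```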
“…Lastly, DDQN was used in [133] to explore the task of navigating unsignalized intersections crowded with pedestrians. A belief update is employed to obtain the perceived state of the environment and the future trajectories of pedestrians given the noisy observations.…”
Section: Intersection Assistance Systems
confidence: 99%
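The "belief update" in this excerpt follows the generic predict-correct form of a Bayes filter. The sketch below shows one such step over a small discrete state space; the three pedestrian-intent states are invented for illustration, and the cited work uses LSTM models rather than an explicit tabular filter.

```python
import numpy as np

def belief_update(belief, transition, likelihood):
    """One step of a discrete Bayes filter (illustrative).

    belief:     current distribution over hidden states, shape (S,)
    transition: T[s, s'] = P(s' | s), shape (S, S)
    likelihood: P(observation | s') for the new noisy observation, shape (S,)
    """
    predicted = belief @ transition          # predict: push belief through dynamics
    posterior = likelihood * predicted       # correct: weight by observation likelihood
    return posterior / posterior.sum()       # normalize back to a distribution

# Hypothetical example: 3 coarse pedestrian-intent states (cross, wait, walk away).
belief = np.array([1 / 3, 1 / 3, 1 / 3])
T = np.array([[0.8, 0.1, 0.1],
              [0.2, 0.7, 0.1],
              [0.1, 0.1, 0.8]])
obs_likelihood = np.array([0.6, 0.3, 0.1])   # noisy observation favors "cross"
belief = belief_update(belief, T, obs_likelihood)
```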
“…In contrast, the learning-based method is more flexible in execution. The dominant method in learning-based AIM is multi-agent reinforcement learning (MARL), including Deep Q Learning (DQN) [39][40][41][42][43], Dueling Deep Q Learning (DDQN) [44], Deep Deterministic Policy Gradient (DDPG) [45], Proximal Policy Optimization (PPO) [46][47][48], Twin Delayed Deep Deterministic Policy Gradient (TD3) [49][50][51], and Soft Actor-Critic (SAC) [52]. Besides, DCL-AIM introduced coordinate state and independent state for CAVs to react in different scenarios [53], RAIM [54] and adv.RAIM [55] applied encoder-decoder structure with LSTM cell, AIM5LA further considered communication delay based on the adv.RAIM [56], and game theory was utilized to determine the leader-follower to enhance the performance of reinforcement learning in [52,57].…”
Section: Literature Review
confidence: 99%
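As a rough illustration of the independent-learning end of the MARL spectrum surveyed in this excerpt, each vehicle can run its own DQN update while treating the other agents as part of the environment. The agent container and batch_fn below are hypothetical stand-ins, not an API from any cited work.

```python
import torch

def independent_dqn_step(agents, batch_fn, gamma=0.99):
    """One training step of independent DQN: each agent minimizes its own TD error.

    agents:   dict mapping agent name -> object with .online, .target, .optimizer
              (hypothetical container, assumed for this sketch)
    batch_fn: returns (obs, act, rew, next_obs, done) tensors sampled from that
              agent's replay buffer (also assumed)
    """
    losses = {}
    for name, agent in agents.items():
        obs, act, rew, next_obs, done = batch_fn(name)
        q = agent.online(obs).gather(1, act.unsqueeze(1)).squeeze(1)
        with torch.no_grad():
            target = rew + gamma * (1 - done) * agent.target(next_obs).max(dim=1).values
        loss = torch.nn.functional.mse_loss(q, target)
        agent.optimizer.zero_grad()
        loss.backward()
        agent.optimizer.step()
        losses[name] = loss.item()
    return losses
```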