2021 China Automation Congress (CAC)
DOI: 10.1109/cac53003.2021.9728707
Research on Autonomous Obstacle Avoidance and Target Tracking of UAV Based on Improved Dueling DQN Algorithm

Cited by 4 publications (2 citation statements)
References 11 publications
“…Dueling DQN is widely considered a significant improvement over conventional DQN. Unlike the natural DQN, dueling DQN divides the Q-network into two parts, the action advantage function A(s_t, a_t; ω, ξ) and the state-value function V(s_t; ω, θ), which are calculated separately [20,21]. It is easy to find which action yields better feedback by learning A(s_t, a_t; ω, ξ).…”
Section: Problem Solution
confidence: 99%
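The decomposition described in the statement above can be sketched in a few lines. This is a minimal NumPy illustration, not code from the cited paper: the weight matrices `w_v` and `w_a` stand in for the two network streams (corresponding loosely to the parameter sets (ω, θ) and (ω, ξ)), and the common mean-subtracted aggregation is used to make the split identifiable.

```python
import numpy as np

def dueling_q(features, w_v, w_a):
    """Combine a state-value stream and an advantage stream into Q-values.

    Uses the mean-subtracted dueling form:
        Q(s, a) = V(s) + ( A(s, a) - mean_a' A(s, a') )
    so that V(s) captures how good the state is overall, while the
    advantage stream ranks the actions within that state.
    """
    v = features @ w_v          # state value V(s), shape (1,)
    a = features @ w_a          # advantages A(s, a), shape (n_actions,)
    return v + (a - a.mean())   # broadcast V(s) across all actions
```

With this aggregation, the mean of the Q-values over actions equals V(s), and the relative ordering of actions comes entirely from the advantage stream, which is what makes it "easy to find which action has better feedback".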
“…Compared to classical reinforcement learning models, the A2C algorithm has a more powerful learning ability, since it includes two neural networks, an Actor and a Critic [30,31]. A2C also supports synchronous parallel sampling training, which on the one hand ensures diversity of data and on the other improves learning efficiency [32]. Moreover, compared with Q-learning [33,34] or DQN [35,36], it is better suited to continuous-space problems, which makes it applicable to the vehicle-swarm control problem in this study [37,38]. To reflect the cooperative obstacle-avoidance behaviour of the automated vehicle swarm, the optimization target in the A2C algorithm incorporates not only the safety and efficiency of an individual vehicle but also the efficiency of the vehicle swarm.…”
Section: Introduction
confidence: 99%
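The Actor/Critic split mentioned in the statement above boils down to a one-step update rule. The sketch below is illustrative only (function and argument names are assumptions, not from the cited work): the Critic learns the state value V(s) via a TD target, and the Actor is pushed toward actions whose advantage (TD error) is positive.

```python
def a2c_losses(log_prob, value, next_value, reward, gamma=0.99, done=False):
    """One-step A2C loss terms for a single transition (s, a, r, s').

    Critic target: r + gamma * V(s')   (bootstrap term dropped at terminal)
    Advantage:     target - V(s)       (the TD error)
    Actor loss:    -log pi(a|s) * advantage   (policy-gradient term)
    Critic loss:   advantage ** 2             (squared TD error)
    """
    target = reward + (0.0 if done else gamma * next_value)
    advantage = target - value
    actor_loss = -log_prob * advantage
    critic_loss = advantage ** 2
    return actor_loss, critic_loss
```

Because the Actor outputs a probability distribution (or distribution parameters) rather than a table of Q-values per discrete action, this structure extends naturally to continuous action spaces, which is the suitability argument the statement makes against Q-learning and DQN.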