2020
DOI: 10.1109/access.2020.3046284

Autonomous Control of Combat Unmanned Aerial Vehicles to Evade Surface-to-Air Missiles Using Deep Reinforcement Learning

Abstract: This paper proposes a new reinforcement learning approach for executing combat unmanned aerial vehicle (CUAV) missions. We consider missions with the following goals: guided missile avoidance, shortest-path flight, and formation flight. For reinforcement learning, the representation of the current agent state is important. We propose a novel method of using the coordinates and angle of a CUAV to effectively represent its state. Furthermore, we develop a reinforcement learning algorithm with enhanced exploration …
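The abstract's central idea is encoding a CUAV's state from its coordinates and angle. Since the abstract is truncated and the paper's exact encoding is not shown here, the following is only a plausible sketch under assumed 2-D conditions: relative offset to the target plus the heading encoded as sine/cosine. The function name and vector layout are hypothetical.

```python
import numpy as np

def cuav_state(pos, target, heading_rad):
    """Hypothetical CUAV state vector (not the paper's exact encoding):
    offset to the target plus heading as sin/cos, which avoids the
    angular discontinuity at 0/2*pi."""
    dx, dy = target[0] - pos[0], target[1] - pos[1]
    return np.array([dx, dy, np.sin(heading_rad), np.cos(heading_rad)],
                    dtype=np.float32)

# Example: CUAV at (10, 5) heading 45 degrees, target at (100, 80)
s = cuav_state((10.0, 5.0), (100.0, 80.0), np.deg2rad(45.0))
```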

Cited by 20 publications (19 citation statements)
References 18 publications
“…• Flock Centering: maintaining flight formation as suggested by Reynolds [95] involves three concepts: 1) flock centering, 2) obstacle avoidance, and 3) velocity matching. This topology was applied in several research papers [82, 96–99]. • Leader-Follower Flocking: the flock leader has its own mission of reaching the destination, while the followers (the other UAVs) flock with the leader, with the mission of maintaining distance and relative position to the leader [100, 101].…”
Section: Flocking
confidence: 99%
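The three rules named in this excerpt come from Reynolds' boids model. Below is a minimal NumPy sketch of one update step; the neighborhood radii, gains, and unit time step are illustrative assumptions, and plain separation stands in for full obstacle avoidance.

```python
import numpy as np

def boids_step(pos, vel, r=5.0, r_sep=1.5, k_coh=0.01, k_sep=0.05, k_ali=0.05):
    """One update of Reynolds' three flocking rules for N agents.
    pos, vel: (N, 2) arrays; radii and gains are illustrative."""
    new_vel = vel.copy()
    for i in range(len(pos)):
        d = np.linalg.norm(pos - pos[i], axis=1)
        nbrs = (d < r) & (d > 0)       # neighbors within radius r
        close = (d < r_sep) & (d > 0)  # neighbors that are too close
        if nbrs.any():
            # 1) flock centering: steer toward the neighbors' centroid
            new_vel[i] += k_coh * (pos[nbrs].mean(axis=0) - pos[i])
            # 3) velocity matching: align with the neighbors' mean velocity
            new_vel[i] += k_ali * (vel[nbrs].mean(axis=0) - vel[i])
        if close.any():
            # 2) separation (a stand-in for obstacle/collision avoidance)
            new_vel[i] += k_sep * (pos[i] - pos[close].mean(axis=0))
    return pos + new_vel, new_vel  # unit time step assumed
```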
“…The final goal is for the agent to move from the starting point to the target point. I represented the coordinates of the agent as a state using an effective coordinate vector [33]. The actions of the agent were set as simple movements: left, right, up, and down.…”
Section: Environment
confidence: 99%
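The environment this excerpt describes is a gridworld: the state is the agent's coordinate vector and the four actions are left, right, up, and down. A minimal sketch under those assumptions follows; the grid size and reward shaping are not specified in the cited text.

```python
import numpy as np

class GridEnv:
    """Gridworld matching the excerpt: coordinate-vector state, four move
    actions. Grid size and reward values are assumptions."""
    ACTIONS = {0: (-1, 0), 1: (1, 0), 2: (0, 1), 3: (0, -1)}  # left, right, up, down

    def __init__(self, size=10, start=(0, 0), goal=(9, 9)):
        self.size, self.start, self.goal = size, start, goal
        self.pos = start

    def reset(self):
        self.pos = self.start
        return np.array(self.pos, dtype=np.float32)  # coordinate vector state

    def step(self, action):
        dx, dy = self.ACTIONS[action]
        x = min(max(self.pos[0] + dx, 0), self.size - 1)  # clamp to the grid
        y = min(max(self.pos[1] + dy, 0), self.size - 1)
        self.pos = (x, y)
        done = self.pos == self.goal
        reward = 1.0 if done else -0.01  # assumed: step penalty, goal bonus
        return np.array(self.pos, dtype=np.float32), reward, done
```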
“…Combining two optimization methods has also been studied [39]. Recently, with the development of deep learning, studies on path planning using RL have mainly been proposed [3], [6], [7], [9], [10], [11], [14], [15], [16], [17], [40], [41], [42]. These studies suppose a specific scenario and set up an environment in which to apply the agent to path planning.…”
Section: Path Planning
confidence: 99%
“…Path planning is a method to find an optimal route from a starting point to a target point. It has been widely used in various fields such as robotics [1], [2], [3], drones [4], [5], [6], [7], [8], [9], military services [10], [11], and self-driving cars [12], [13]. Recently, reinforcement learning (RL) has been the main approach studied for path planning [3], [7], [9], [10], [11], [14], [15], [16], [17].…”
Section: Introduction
confidence: 99%
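To make "RL for path planning" concrete, here is a tabular Q-learning sketch that learns a route on the GridEnv sketched earlier. This is a generic baseline, not the method of any cited paper; the hyperparameters and step cap are illustrative.

```python
import numpy as np

def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, eps=0.2):
    """Tabular Q-learning on GridEnv; hyperparameters are illustrative."""
    Q = np.zeros((env.size, env.size, 4))
    for _ in range(episodes):
        s = tuple(env.reset().astype(int))
        for _ in range(1000):  # step cap so untrained episodes still end
            # epsilon-greedy action selection
            a = np.random.randint(4) if np.random.rand() < eps else int(np.argmax(Q[s]))
            obs, r, done = env.step(a)
            s2 = tuple(obs.astype(int))
            # TD(0) update toward r + gamma * max_a' Q(s', a')
            target = r + (0.0 if done else gamma * Q[s2].max())
            Q[s][a] += alpha * (target - Q[s][a])
            s = s2
            if done:
                break
    return Q

# Train on the assumed gridworld; a greedy argmax rollout then gives the route
Q = q_learning(GridEnv())
```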