Abstract: This paper presents a flight penetration path planning algorithm for stealth unmanned aerial vehicles (UAVs) in a complex environment with Bogie or Bandit (BB) threats. The emergence of rigorous air defense radar networks necessitates efficient flight path planning and replanning for stealth UAVs with respect to survivability and penetration ability. We propose an improved A-Star algorithm based on a multiple step search approach to deal with this emerging problem. The objective is to achieve rapid penetration path pl…
“…TUAV path optimization under radar tracking threat has been explored by a number of researchers in non-RL settings [33,21,36,17,35,12,11]. The methods proposed by these researchers include the Nonlinear Trajectory Generation (NTG) algorithm [11], the Label Setting Algorithm (LSA) [33], the A* algorithm [21,35], and a numerical optimization procedure for a minimax optimal control with a moving average functional [12]. Among the non-RL based solutions, the one presented in [12] is utilized in our work since it integrates the model of the aircraft, a probabilistic model of a radar, and a behavioral approximation of missile subsystems based on the decision process for launching a SAM and the requirement to maintain tracking during missile flyout.…”
Section: Background and Environment Modeling
“…For an aircraft to be destroyed at time 𝑡, the radar system must have tracked it during the continuous interval [𝑡 − (𝑇_𝑟 + 𝑇_𝑓), 𝑡]. If Δ𝑇 = 𝑇_𝑟 + 𝑇_𝑓 is defined to be the threat window, then probability of kill is defined as: The UAV is represented using a kinematic model given in [12] and also adopted in [36], which accounts for the coupling between the RCS and aircraft dynamics. To be more specific, the turn rate of the aircraft is determined by its bank angle, which in turn is determined by a steering input represented by 𝑢.…”
Section: A Probabilistic Model Of Radar For Detection Tracking and De...
Tackling tactical UAV path planning under radar threat using reinforcement learning involves particular challenges, ranging from modeling-related difficulties to the sparse feedback problem. Learning goal-directed behavior with sparse feedback from complex environments is a fundamental challenge for reinforcement learning algorithms. In this paper we extend our previous work in this area to provide a solution to the problem setting stated above, using Hierarchical Reinforcement Learning (HRL) in a novel way that involves a meta controller for higher-level goal assignment and a controller that determines the lower-level actions of the agent. Our meta controller is based on a regression model trained using a state transition scheme that defines the evolution of goal designation, whereas our lower-level controller is based on a Deep Q Network (DQN) and is trained via reinforcement learning iterations. This two-layer framework ensures that an optimal plan for a complex path, organized as multiple goals, is achieved gradually, through piecewise assignment of sub-goals, and thus as a result of a staged, efficient and rigorous procedure.
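The threat-window condition quoted above (continuous tracking over [𝑡 − Δ𝑇, 𝑡]) can be illustrated with a minimal sketch. It only checks the necessary tracking condition on a sampled trajectory; it does not reproduce the kill-probability formula, which is not shown in the excerpt. The function name, the boolean sampling of the track state, and the fixed time step `dt` are assumptions made for illustration.

```python
def continuously_tracked(tracked, dt, t_r, t_f):
    """For each sample, report whether tracking held over the whole threat window.

    tracked : list of bools, True when the radar tracks the aircraft at that sample
              (hypothetical discretization, not from the cited paper)
    dt      : sampling interval in seconds (assumed constant)
    t_r, t_f: the two durations whose sum is the threat window
    """
    window = t_r + t_f
    # Number of sampling intervals spanned by the closed interval [t - window, t].
    steps = round(window / dt)
    run = 0  # length of the current unbroken run of tracked samples
    flags = []
    for is_tracked in tracked:
        run = run + 1 if is_tracked else 0
        # The window covers steps + 1 samples, including the current one.
        flags.append(run >= steps + 1)
    return flags
```

Any single dropped track resets the run, so the condition can only become true again after a full threat window of uninterrupted tracking.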
“…International Journal of Aerospace Engineering constraints [9]. Path planning algorithms are usually divided into global path planning algorithms and local path planning algorithms [10]. Among them, the global path planning algorithm requires that the environmental model is known, and the algorithm can generate the global optimal path according to the environmental constraints, and the representative one is the A* algorithm [11].…”
Unmanned helicopters (UH) can evade radar detection by flying at ultralow altitudes, so as to conduct raids on targets. Path planning is one of the key technologies that allow a UH to complete raid missions autonomously. Since the probability of a UH being detected by radar varies with height, accurately identifying the radar coverage area so as to avoid crossing it has become a difficult problem in UH path planning. To address this problem, a heuristic deep Q-network (H-DQN) algorithm is proposed. First, as part of the comprehensive reward function, a heuristic reward function is designed. The function can generate dynamic rewards in real time according to the environmental information, so as to guide the UH to move closer to the target and at the same time promote the convergence of the algorithm. Second, in order to smooth the flight path, a smoothing reward function is proposed. This function can evaluate the pros and cons of the UH's actions, so as to prompt the UH to choose a smoother path for flight. Finally, the heuristic reward function, the smoothing reward function, the collision penalty, and the completion reward are weighted and summed to obtain the heuristic comprehensive reward function. Simulation experiments show that the H-DQN algorithm can help the UH to effectively avoid the radar coverage area and successfully complete the raid mission.
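The weighted sum described in the abstract above can be sketched as follows. This is only an illustration of combining four reward terms; the weight values, the class name `RewardWeights`, and the function name are hypothetical, and the abstract does not specify how the individual terms are computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Illustrative weights only; the abstract does not give the actual values.
    heuristic: float = 1.0
    smoothing: float = 0.5
    collision: float = 1.0
    completion: float = 1.0


def comprehensive_reward(heuristic_r, smoothing_r, collision_penalty,
                         completion_r, weights=RewardWeights()):
    """Weighted sum of the four reward terms named in the abstract.

    Each argument is the already-computed value of one term for the
    current step; collision_penalty is expected to be zero or negative.
    """
    return (weights.heuristic * heuristic_r
            + weights.smoothing * smoothing_r
            + weights.collision * collision_penalty
            + weights.completion * completion_r)
```

Keeping the weights in one small object makes it easy to experiment with how strongly the smoothing term is traded off against progress toward the target.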
“…[28][29][30] Therefore, a weighted factor ω (ω > 1) was introduced into the heuristic function of the conventional A-Star algorithm to increase the algorithm search depth and ensure path optimization. 31 The heuristic function of the BML A-Star algorithm is given by f(n) = g(n) + ωh(n) (13)…”
For stealth unmanned aerial vehicles (UAVs), path security and search efficiency of penetration paths are the two most important factors in performing missions. This article investigates an optimal penetration path planning method that simultaneously considers the principles of kinematics, the dynamic radar cross-section of stealth UAVs, and the network radar system. By introducing the radar threat estimation function and a 3D bidirectional sector multilayer variable step search strategy into the conventional A-Star algorithm, a modified A-Star algorithm is proposed which aims to satisfy both waypoint accuracy and search efficiency. Next, using the proposed penetration path planning method, new waypoints are selected that simultaneously satisfy the attitude angle constraints and the rank-K fusion criterion of the radar system. Furthermore, for comparative analysis of different algorithms, the conventional A-Star algorithm, the bidirectional multilayer A-Star algorithm, and the modified A-Star algorithm are used to solve the penetration path problem that UAVs face under various threat scenarios. Finally, the simulation results indicate that the paths obtained by employing the modified algorithm have optimal path costs and higher safety in a 3D complex network radar environment, which shows the effectiveness of the proposed path planning scheme.
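The weighted evaluation function f(n) = g(n) + ωh(n) quoted above can be illustrated with a minimal sketch. This is not the BML or modified A-Star algorithm of the article: the 2D occupancy grid, the 4-connected unit-cost moves, the Manhattan-distance heuristic, and the default ω = 1.5 are all assumptions chosen to keep the example short. With ω > 1 the search is drawn more strongly toward the goal, and in general the returned path is no longer guaranteed to be the shortest one.

```python
import heapq


def weighted_a_star(grid, start, goal, omega=1.5):
    """Grid search ordered by f(n) = g(n) + omega * h(n).

    grid  : list of lists, 0 = free cell, 1 = blocked cell (hypothetical map)
    start, goal : (row, col) tuples
    Returns the path as a list of cells, or None if the goal is unreachable.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance to the goal (assumed heuristic).
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(omega * h(start), 0, start)]  # entries are (f, g, cell)
    g_cost = {start: 0}
    parent = {start: None}
    closed = set()

    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(node)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if grid[nr][nc] != 0 or nxt in closed:
                continue
            new_g = g + 1  # unit step cost
            if new_g < g_cost.get(nxt, float("inf")):
                g_cost[nxt] = new_g
                parent[nxt] = node
                heapq.heappush(open_heap, (new_g + omega * h(nxt), new_g, nxt))
    return None
```

Setting `omega=1.0` recovers the conventional A-Star ordering, which is a convenient baseline when comparing how many cells each setting expands.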