Deterministic Policy Gradient With Integral Compensator for Robust Quadrotor Control

et al. 2021

IEEE Access

Mobile robots contributed significantly to the intelligent development of human society, and the motion-planning policy is critical for mobile robots. This paper reviews the methods based on motion-planning policy, especially the ones involving Deep Reinforcement Learning (DRL) in the unstructured environment. The conventional methods of DRL are categorized to value-based, policy-based and actor-critic-based algorithms, and the corresponding theories and applications are surveyed. Furthermore, the recently-emerged methods of DRL are also surveyed, especially the ones involving the imitation learning, meta-learning and multi-robot systems. According to the surveys, the potential research directions of motion-planning algorithms serving for mobile robots are enlightened.

Section: Figure 8 DQN Algorithm Network Architecture

mentioning

confidence: 99%

Section: Figure 12 Motion Planning Principle of DDPG

mentioning

confidence: 99%

Motion Planning for Mobile Robots—Focusing on Deep Reinforcement Learning: A Systematic Review

Sun

et al. 2021

IEEE Access

“…Adaptive dynamic programming (ADP) [1][2][3][4], which integrates the advantages of reinforcement learning (RL) [5][6][7][8] and adaptive control, has become a powerful tool in solving optimal control problems. With decades of development, ADP has also provided many approaches to solve other control problems, such as robust control [9,10], optimal control with input constraints [11,12], optimal tracking control [13,14], zero-sum games [15], and non-zero-sum games [16]. Furthermore, ADP methods have been widely applied to the real-world systems, such as water-gas shift reaction [17], battery management [18], microgrid systems [19,20], and Quanser helicopter [21].…”

Section: Introduction

mentioning

confidence: 99%

Neural Network‐Based Intelligent Computing Algorithms for Discrete‐Time Optimal Control with the Application to a Cyberphysical Power System

Jiang

et al. 2021

Complexity

Adaptive dynamic programming (ADP), which belongs to the field of computational intelligence, is a powerful tool to address optimal control problems. To overcome the bottleneck of solving Hamilton–Jacobi–Bellman equations, several state-of-the-art ADP approaches are reviewed in this paper. First, two model-based offline iterative ADP methods including policy iteration (PI) and value iteration (VI) are given, and their respective advantages and shortcomings are discussed in detail. Second, the multistep heuristic dynamic programming (HDP) method is introduced, which avoids the requirement of initial admissible control and achieves fast convergence. This method successfully utilizes the advantages of PI and VI and overcomes their drawbacks at the same time. Finally, the discrete-time optimal control strategy is tested on a power system.

“…However, these linear approaches are sensitive to the nonlinearities, and the flight performance will be degraded when the disturbances occur. In order to improve the robust control performance, more efforts have been made on the nonlinear control approaches, such as backstepping control [17], sliding mode control (SMC) [18], active disturbance rejection control (ADRC) [19], fuzzy control and other intelligent control methods [20]-[22]. The SMC method is insensitive to the disturbances and is widely applied for the nonlinear systems [23], [24].…”

Section: Introduction

mentioning

confidence: 99%

Robust and Adaptive Backstepping Control for Hexacopter UAVs

Deng

et al. 2019

IEEE Access

A nonlinear robust and adaptive backstepping control strategy is hierarchically proposed to solve the trajectory tracking problem of hexacopter UAVs. Due to the under-actuated and coupled properties of the hexacopter dynamics, the nominal backstepping control approach is fully designed as the main controller. Considering the model uncertainties and external disturbances perturbing the system stability, a robust 2nd-order linear extended state observer (LESO) with more reliable velocity feedback is devised to observe and suppress the instabilities, and peaking phenomena in the observation are removed. Usually, large observer gains are selected to reduce the tracking errors but will amplify the measurement noise. To further enhance the system robustness, an adaptive switching function based compensator is introduced to eliminate the observation errors, through which the requirement on large observer gains is relaxed, and high gain behaviors of the LESO are avoided. Stability analysis proves that the nonlinear control scheme can ensure the hexacopter UAV asymptotic tracking along the designated trajectory. Comparative simulations under different controllers are carried out to demonstrate the efficiency and superiority of the proposed control scheme.

INDEX TERMS Hexacopter UAV, trajectory tracking, robust backstepping control, 2nd-order LESO, hierarchical compensators.