2020
DOI: 10.1109/tsmc.2018.2884725

Deterministic Policy Gradient With Integral Compensator for Robust Quadrotor Control

Cited by 129 publications (48 citation statements)
References 32 publications
“…The DQN algorithm also accepts other types of images as input. Based on depth image information, Zhang et al. [84] transferred a learned navigation strategy to unknown environments by inheriting features, realizing simulation-to-reality transfer of robot navigation policies with DQN, as shown in FIGURE 9(b). In a 3D simulation environment, Barron et al. [85] used RGB images as the DQN input and trained the robot on complex tasks with a deeper neural network, achieving better performance than with depth images or video.…”
Section: Figure 8 DQN Algorithm Network Architecture
confidence: 99%
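A minimal sketch of the kind of Q-network the cited approach [85] describes, taking raw RGB frames as input; the layer sizes, frame resolution, and action count below are illustrative assumptions, not values from the cited paper.

# Sketch of a DQN Q-network over RGB frames (PyTorch assumed).
import torch
import torch.nn as nn

class RGBQNetwork(nn.Module):
    def __init__(self, num_actions: int = 4):
        super().__init__()
        # Convolutional encoder over 84x84 RGB frames (3 channels).
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
        )
        # Fully connected head mapping features to one Q-value per action.
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, num_actions),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(x))

q_net = RGBQNetwork()
frame = torch.rand(1, 3, 84, 84)     # one normalized RGB frame
action = q_net(frame).argmax(dim=1)  # greedy action selection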
“…The trained policy can be applied directly in both virtual and real environments. Wang et al. [102] used the deterministic policy gradient to control quadrotor flight. DDPG maps the system state directly to the control command, and an integral compensator is embedded into the actor-critic structure, greatly improving the tracking accuracy and robustness of the quadrotor.…”
Section: Figure 12 Motion Planning Principle of DDPG
confidence: 99%
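A hedged sketch of the integral-compensator idea summarized above: the running integral of the tracking error is appended to the observation fed to the DDPG actor and critic, so the learned policy can cancel steady-state error. The class name, gains, dimensions, and anti-windup clamp are illustrative assumptions, not details from Wang et al. [102].

# Sketch of state augmentation with an integral compensator (NumPy assumed).
import numpy as np

class IntegralCompensator:
    def __init__(self, dim: int, dt: float = 0.01, limit: float = 5.0):
        self.integral = np.zeros(dim)
        self.dt = dt
        self.limit = limit  # anti-windup clamp on the accumulated error

    def augment(self, state: np.ndarray, error: np.ndarray) -> np.ndarray:
        # Accumulate the tracking error over time, then clamp it.
        self.integral = np.clip(self.integral + error * self.dt,
                                -self.limit, self.limit)
        # Feed [state, integral of error] to the DDPG actor/critic.
        return np.concatenate([state, self.integral])

comp = IntegralCompensator(dim=3)
state = np.zeros(12)                    # e.g. quadrotor pose and rates
error = np.array([0.1, -0.05, 0.02])    # position tracking error
augmented = comp.augment(state, error)  # input to the policy network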
“…Adaptive dynamic programming (ADP) [1][2][3][4], which integrates the advantages of reinforcement learning (RL) [5][6][7][8] and adaptive control, has become a powerful tool for solving optimal control problems. Over decades of development, ADP has also yielded approaches to other control problems, such as robust control [9,10], optimal control with input constraints [11,12], optimal tracking control [13,14], zero-sum games [15], and non-zero-sum games [16]. Furthermore, ADP methods have been widely applied to real-world systems, such as the water-gas shift reaction [17], battery management [18], microgrid systems [19,20], and the Quanser helicopter [21].…”
Section: Introduction
confidence: 99%
“…However, these linear approaches are sensitive to nonlinearities, and flight performance degrades when disturbances occur. To improve robust control performance, much effort has gone into nonlinear control approaches, such as backstepping control [17], sliding mode control (SMC) [18], active disturbance rejection control (ADRC) [19], and fuzzy control and other intelligent control methods [20]-[22]. The SMC method is insensitive to disturbances and is widely applied to nonlinear systems [23], [24].…”
Section: Introduction
confidence: 99%
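A minimal sketch illustrating why SMC is insensitive to bounded matched disturbances, for a second-order tracking-error system; the gains and the tanh smoothing of the switching term are illustrative assumptions, not details from the cited works.

# Sketch of a first-order sliding mode control law (NumPy assumed).
import numpy as np

def smc_control(e: float, e_dot: float,
                lam: float = 2.0, k: float = 5.0, eps: float = 0.05) -> float:
    # Sliding surface s = e_dot + lam * e; driving s -> 0 makes the
    # tracking error decay exponentially regardless of the disturbance.
    s = e_dot + lam * e
    # Switching term: k must exceed the disturbance bound; tanh(s/eps)
    # replaces sign(s) to reduce chattering.
    return -lam * e_dot - k * np.tanh(s / eps)

u = smc_control(e=0.2, e_dot=-0.1)  # control correction for this error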