2018
DOI: 10.1049/trit.2018.1007

Adaptive PID controller based on Q-learning algorithm

Abstract: An adaptive proportional-integral-derivative (PID) controller based on Q-learning algorithm is proposed to balance the cart-pole system in simulation environment. This controller was trained using Q-learning algorithm and implemented the learned Q-tables to change the gains of linear PID controllers according to the state of the system during the control process. The adaptive PID controller based on Q-learning algorithm was trained from a set of fixed initial positions and was able to balance the system starti…
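The abstract describes learned Q-tables that switch the gains of a linear PID controller according to the system state. A minimal Python sketch of that idea follows. It is not the authors' implementation: the candidate gain sets in `GAIN_SETS`, the bin edges in `ANGLE_BINS`, the use of pole angle alone as the state, and the names `discretize`, `select_gains`, and `PID` are all assumptions made for illustration.

```python
import random

# Hypothetical candidate gain sets (Kp, Ki, Kd); the paper's values are not given here.
GAIN_SETS = [(10.0, 0.0, 1.0), (20.0, 0.5, 2.0), (40.0, 1.0, 4.0)]
# Assumed pole-angle bin edges in radians; 4 edges give 5 discrete states.
ANGLE_BINS = [-0.1, -0.02, 0.02, 0.1]


def discretize(angle):
    """Map a pole angle to a state index by counting the bin edges below it."""
    return sum(angle > edge for edge in ANGLE_BINS)


def select_gains(q_table, angle, epsilon=0.0, rng=random):
    """Choose a gain set for the current state (epsilon-greedy over the Q-table)."""
    state = discretize(angle)
    if rng.random() < epsilon:
        # Explore: pick a random gain set.
        action = rng.randrange(len(GAIN_SETS))
    else:
        # Exploit: pick the gain set with the highest learned Q-value.
        row = q_table[state]
        action = max(range(len(row)), key=row.__getitem__)
    return state, action, GAIN_SETS[action]


class PID:
    """Discrete-time PID controller whose gains can be swapped at every step."""

    def __init__(self, dt):
        self.dt = dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error, gains):
        kp, ki, kd = gains
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return kp * error + ki * self.integral + kd * derivative
```

At each control step, a loop built on this sketch would call `select_gains` with the measured angle and pass the returned gains to `PID.step`; during training, `epsilon` would be set above zero so that other gain sets are also tried.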

citations
Cited by 41 publications
(20 citation statements)
references
References 27 publications
“…The Q-learning algorithm is an offline rule of reinforcement learning (RL) [48], [49]. It approximates and updates the current rule with the optimal action-value (Q*) based on the action-value (Q).…”
Section: B Deterministic Q-slp Algorithm With a Stable Learning Rate (mentioning)
confidence: 99%
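The statement above refers to updating the action-value Q toward the optimal action-value Q*. For reference, the standard tabular Q-learning update can be sketched as below. This is the textbook rule, not code from the citing paper; the function name `q_update` and the default values of `alpha` and `gamma` are assumptions.

```python
def q_update(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place and return the new Q-value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    # Bootstrap from the best action in the next state, whatever action is taken next.
    best_next = max(q_table[next_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])
    return q_table[state][action]
```

Because the target uses the maximum over next actions rather than the action actually taken, the update estimates Q* independently of the exploration policy being followed.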
“…Reinforcement Learning is an approach aimed at solving problems such as control systems [18], energy management systems [19][20][21] and is one of the methods of machine learning. The essence of learning influenced this approach because only by communicating with the environment can the control policy produce without understanding the underlying system model.…”
Section: Reinforcement Learning (mentioning)
confidence: 99%
“…The purpose of the agent is to extract the optimum control strategy to optimize the discounted accumulated rewards, called as expected discounted return G t in the long term, the governing equation of which is given in Ref. [18].…”
Section: Reinforcement Learning (mentioning)
confidence: 99%
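The statement above mentions the expected discounted return G t without reproducing its equation. Under the usual definition, G_t = R_{t+1} + γ R_{t+2} + γ² R_{t+3} + …, the return of one finite episode can be computed as sketched below. The notation in Ref. [18] may differ; the function name `discounted_return` and the default `gamma` are assumptions.

```python
def discounted_return(rewards, gamma=0.99):
    """Return the discounted sum of a finite reward sequence.

    rewards[0] is the first reward received after time t, so the result is
    rewards[0] + gamma * rewards[1] + gamma**2 * rewards[2] + ...
    """
    g = 0.0
    # Accumulate backwards so each step needs one multiply and one add.
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

For example, three rewards of 1 with `gamma=0.5` give 1 + 0.5 + 0.25 = 1.75. The expected return is the average of this quantity over many episodes.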
“…Therefore, it is interesting to establish a hybrid algorithm that combines the intelligent DRL (for example, the aforementioned dueling DQN) algorithm and a traditional PID controller, in order to take advantage of DRL's self-learning capability to tune a PID performance online. Unlike the practice proposed by some literatures [16,30], which uses the reinforcement learning approach to adjust the gains of PID controllers, in this paper, a simpler but more powerful method will be introduced by adding a dueling DQN algorithm directly after a fine-tuned PID controller, as can be seen in Figure 7. There are two special modifications that need to be considered here.…”
Section: Dueling Deep Q-network Architecture (mentioning)
confidence: 99%