2022
DOI: 10.1016/j.compchemeng.2022.107658

Alleviating parameter-tuning burden in reinforcement learning for large-scale process control


Cited by 12 publications (3 citation statements)
References 24 publications
“…Theoretical research has demonstrated that the relative entropy term implicitly averages the error of the approximated value function with state-of-the-art error dependency [18], [19]. In terms of applications, this characteristic also contributes to superior sample efficiency and learning capability in a wide range of engineering tasks, from robot control [20], [21] to chemical platform optimization [22], [23], where the agents efficiently explore the target tasks with a limited number of interactions using smoothly updated policies. However, despite the promising practical results, current works are mainly limited to tasks with discrete actions, and directly extending the relative entropy regularization to DDPG-like RL approaches with continuous action spaces is tricky: the relative entropy regularization in DPP requires traversal softmax operations over the entire discrete action space, which is intractable for continuous actions under the actor-critic (AC) structure of DDPG.…”
Section: CDPP (Ours)
confidence: 99%
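
The statement above points out that the DPP-style relative-entropy update needs a "traversal softmax" over the whole discrete action set. Below is a minimal sketch of that operator; the function name, the example preference values, and the temperature eta are illustrative assumptions, not taken from the cited works.

import numpy as np

def softmax_backup(preferences, eta):
    # Log-sum-exp ("traversal softmax") over an entire discrete action set,
    # as used in relative-entropy-regularized value updates.
    #   preferences: 1-D array of action preferences for a single state,
    #                one entry per discrete action
    #   eta:         inverse temperature weighting the KL regularization
    z = eta * np.asarray(preferences, dtype=float)
    m = z.max()  # shift for numerical stability
    return (m + np.log(np.exp(z - m).sum())) / eta

# Example with four discrete actions. The sum enumerates every action,
# which is exactly what becomes intractable for a continuous action space
# under a DDPG-style actor-critic.
print(softmax_backup([1.0, 0.5, -0.2, 0.3], eta=5.0))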
“…Pan et al. 300 present a reinforcement learning control approach that can handle nonlinear stochastic optimal control problems and has the potential to meet state constraints. Some recent advances in reinforcement learning concern boosting the performance of such algorithms, as discussed in Zhu et al., 301 and leveraging reinforcement learning for the tuning of PID controllers, as shown in Dogru et al. 302 In Schwung et al., 303 the reinforcement learning task is sped up by deploying programmable logic controller information. Moreover, recent reinforcement learning control strategies aimed at batch control can be found elsewhere (Ma et al., 304 Kim et al., 305 Yoo et al., 306 Joshi et al., 307 and Mowbray et al. 308).…”
Section: Reinforcement Learning Algorithms
confidence: 99%
“…Such an RL framework regularized by KL divergence is theoretically proven to have state-of-the-art error dependency, as it implicitly averages over all previous action value functions and hence also averages their errors according to [18], [19]. This characteristic contributes to great data efficiency in various challenging engineering applications, from robot manipulation [20], [21] to chemical plant control [22], [23], where the agents quickly explore the task within a limited number of interactions through smoothly updated policies.…”
Section: Approach
confidence: 99%
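
As a rough illustration of the averaging property referred to above (the notation here is assumed for this sketch, not taken verbatim from the cited works), a KL penalty toward the previous policy yields the mirror-descent-style closed-form update

\pi_{k+1}(a \mid s) \;\propto\; \pi_{k}(a \mid s)\,\exp\!\big(\eta\, Q_{k}(s,a)\big),

and unrolling the recursion gives

\pi_{k+1}(a \mid s) \;\propto\; \pi_{0}(a \mid s)\,\exp\!\Big(\eta \sum_{j=0}^{k} Q_{j}(s,a)\Big),

so the policy depends on the sum of all previous action-value estimates; independent estimation errors in the individual Q_j are therefore averaged rather than compounded, which is the error-dependency result the statement cites [18], [19].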