2021 · Preprint
DOI: 10.48550/arxiv.2102.08539

Separated Proportional-Integral Lagrangian for Chance Constrained Reinforcement Learning

Abstract: Safety is essential for reinforcement learning (RL) applied to real-world tasks such as autonomous driving. Chance constraints, which guarantee the satisfaction of state constraints with high probability, are suitable for representing the requirements of real-world environments with uncertainty. Existing chance-constrained RL methods, such as the penalty method and the Lagrangian method, either exhibit periodic oscillations or fail to satisfy the constraints. In this paper, we address these shortcomings by proposing a separat…
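For readers skimming the abstract, the chance-constrained RL problem it refers to can be written in a standard form (this formulation is illustrative, not quoted from the paper; $J$, $S_{\text{safe}}$, and $\delta$ are generic notation):

\[
\max_{\pi}\; J(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{T}\gamma^{t} r(s_t,a_t)\right]
\quad\text{s.t.}\quad
\Pr\!\left(s_t \in S_{\text{safe}},\ \forall\, t \le T\right) \ge 1-\delta .
\]

The Lagrangian method mentioned in the abstract relaxes the constraint with a multiplier $\lambda \ge 0$, giving $\mathcal{L}(\pi,\lambda) = J(\pi) - \lambda\,\bigl[(1-\delta) - \Pr(s_t \in S_{\text{safe}},\ \forall\, t \le T)\bigr]$, and alternates policy updates with multiplier updates; oscillation of $\lambda$ under this scheme is one of the shortcomings the paper targets. The title suggests the multiplier is instead driven by a proportional-integral (PI) controller on the constraint violation. The sketch below illustrates that idea only; the gains `k_p`, `k_i`, the threshold `delta`, and the helper `estimate_violation_prob` are hypothetical names, not taken from the paper.

```python
# Illustrative PI-controlled Lagrange multiplier for a chance constraint.
# All names and hyperparameters here are placeholders, not the paper's.

class PIMultiplier:
    def __init__(self, k_p=0.5, k_i=0.1, delta=0.05):
        self.k_p = k_p        # proportional gain
        self.k_i = k_i        # integral gain
        self.delta = delta    # allowed violation probability (1 - required safety level)
        self.integral = 0.0   # accumulated constraint-violation error

    def update(self, violation_prob):
        # Positive error means the policy violates the state constraint
        # more often than the chance constraint allows.
        error = violation_prob - self.delta
        # Keep the integral non-negative so the multiplier can relax back
        # toward zero once the constraint is satisfied.
        self.integral = max(0.0, self.integral + error)
        return max(0.0, self.k_p * error + self.k_i * self.integral)


# Schematic use inside a constrained policy-gradient loop:
#   lam = multiplier.update(estimate_violation_prob(policy))
#   loss = -expected_return(policy) + lam * estimate_violation_prob(policy)
```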

Cited by 1 publication (2 citation statements) · References 14 publications (26 reference statements)
“…However, the framework is limited to linear systems with additive uncertainty. Recently, in [24] the authors present an approach to address high probability constraint satisfaction based on the augmented lagrangian. However, the penalty term presented does not provide information about the quality of control selection (i.e.…”
Section: Safe Reinforcement Learning (mentioning)
Confidence: 99%
“…A number of RL-based methodologies have been proposed to ensure operational constraints are satisfied with high probability [26,25,24]. Other works have been proposed to consider the process-model mismatch that exists when learning an RL policy offline [10,13,12].…”
Section: Contribution (mentioning)
Confidence: 99%