Data-based reinforcement learning approximate optimal control for an uncertain nonlinear system with control effectiveness faults

Deptula, Patryk; Bell, Zachary I.; Doucette, Emily A.; Curtis, J. Willard; Dixon, Warren E.

doi:10.1016/j.automatica.2020.108922

Cited by 30 publications

(18 citation statements)

References 20 publications


“…In contrast to (20), the version of the BE in (22) is a function of $\nabla B$ and therefore selecting the weight estimates to minimize (22) may not correspond to the minimization of the original BE in (20). That is, even if $(\hat{W}_c, \hat{W}_a) \to W$, the BE in (22) may be large at certain points in the state space because of the influence of the safeguarding component of (21), making (22) a non-ideal performance metric for learning.…”

Section: Safe Exploration Via Simulation Of Experience (mentioning)

confidence: 90%

“…However, since a model of the system is known (or, as discussed in Sec. VI, an approximation of the model is known), the BE can be evaluated at any point in the state space [28] using a different policy to generate data more representative of (20). To facilitate this approach, define the family of mappings $\{x_i : \chi \times \mathbb{R}_{\geq t_0} \to \chi\}_{i=1}^{N}$ such that each $x_i(x(t), t) \in B_l(x(t))$ maps the current state $x(t)$ to some unexplored point in $B_l(x(t))$.…”

Section: Safe Exploration Via Simulation Of Experience (mentioning)

confidence: 99%
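
The statement above describes mapping the current state to unexplored points in a neighbourhood $B_l(x(t))$. A minimal sketch of one way to generate such points follows, assuming $B_l(x)$ is a Euclidean ball of radius $l$ centred at $x$ and that uniform random sampling is an acceptable choice of mapping. Neither assumption is stated in the quoted text, and the name `sample_in_ball` is hypothetical.

```python
import math
import random


def sample_in_ball(x, radius, n_points, rng):
    """Draw n_points points uniformly from the closed ball of given radius around x.

    Assumption: the neighbourhood B_l(x) is a Euclidean ball; the quoted
    statement does not define it.
    """
    dim = len(x)
    points = []
    for _ in range(n_points):
        # Random direction: a Gaussian vector scaled to unit length.
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in g)) or 1.0
        # Radius drawn so that the points are uniform in volume.
        r = radius * rng.random() ** (1.0 / dim)
        points.append([xi + r * gi / norm for xi, gi in zip(x, g)])
    return points


# Usage: 50 candidate evaluation points within distance 0.5 of the state.
rng = random.Random(0)
candidates = sample_in_ball([1.0, -2.0], 0.5, 50, rng)
```

Each candidate point could then be used as a location at which to evaluate the BE with the known or approximated model.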

“…(23) Note that the exploratory policy $u_i$ is not augmented with the safeguarding controller, thereby allowing for maximum exploration of the state-space by the extrapolated system trajectories without risking safety violation of the original system. Consequently, the BE in (23) is representative of the original BE from (20) and is used to update the weight estimates using a recursive least squares update law as [20]…”

Section: Safe Exploration Via Simulation Of Experience (mentioning)

confidence: 99%
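
The statement above mentions a recursive least squares update law but the quoted excerpt does not reproduce it. As a rough illustration only, the sketch below implements the textbook discrete-time recursive least squares step with a forgetting factor. It is not the update law of the cited paper, which may be continuous-time and normalized differently; the names `rls_step`, `phi`, and `forgetting` are hypothetical.

```python
def rls_step(w, P, phi, y, forgetting=1.0):
    """One textbook recursive least squares step for the model y ~ phi^T w.

    w: current weight estimate (list), P: symmetric gain matrix (list of lists),
    phi: regressor (list), y: scalar target. Returns the updated (w, P).
    """
    n = len(w)
    # P @ phi; because P is symmetric, phi^T @ P is the same vector.
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = forgetting + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    # Prediction error (residual) for the current sample.
    error = y - sum(phi[i] * w[i] for i in range(n))
    w_new = [w[i] + gain[i] * error for i in range(n)]
    P_new = [[(P[i][j] - gain[i] * P_phi[j]) / forgetting for j in range(n)]
             for i in range(n)]
    return w_new, P_new


# Usage: recover the weights of y = 2*a - 3*b from noiseless samples.
w = [0.0, 0.0]
P = [[1e6, 0.0], [0.0, 1e6]]
for phi in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]):
    w, P = rls_step(w, P, phi, 2.0 * phi[0] - 3.0 * phi[1])
```

In an ADP setting the residual would plausibly be the BE evaluated at an extrapolated point, but that mapping is an assumption here rather than something the excerpt specifies.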

“…The proof follows the same steps as that of Theorem 2 and is omitted. See [28], [20] for proofs that establish stability of a similar ADP scheme in the presence of model uncertainty.…”

Section: Incorporating Uncertain Systems (mentioning)

confidence: 99%

“…One learning-based control method that has garnered significant attention over the past decade is ADP, which relies on ideas from RL and tools from adaptive control [14] to solve optimal control problems online for potentially uncertain nonlinear systems. Although a variety of such methods have been proposed over the past decade [15], [16], [17], the majority of traditional approaches [18] are limited to unconstrained optimal control problems or those with input constraints [19], [20]. More recently, BFs and ADP have been merged in [21], [22], [23], [24] as a means of bridging the gap between learning-based control and safety-critical control.…”

Section: Introduction (mentioning)

confidence: 99%


Safe Exploration in Model-based Reinforcement Learning using Control Barrier Functions

Cohen, Belta. 2021. Preprint.

This paper studies the problem of developing an approximate dynamic programming (ADP) framework for learning online the value function of an infinite-horizon optimal problem while obeying safety constraints expressed as control barrier functions (CBFs). Our approach is facilitated by the development of a novel class of CBFs, termed Lyapunov-like CBFs (LCBFs), that retain the beneficial properties of CBFs for developing minimally-invasive safe control policies while also possessing desirable Lyapunov-like qualities such as positive semi-definiteness. We show how these LCBFs can be used to augment a learning-based control policy so as to guarantee safety and then leverage this approach to develop a safe exploration framework in a model-based reinforcement learning setting. We demonstrate that our developed approach can handle more general safety constraints than state-of-the-art safe ADP methods through a variety of numerical examples.


Interference compensation discrete control based on memory data for a class of nonlinear systems

Wang. 2022. Optim Control Appl Methods.

In this article, to solve the interference compensation control problem for a class of nonlinear systems, we propose a method based on memory data that strongly suppresses interference. First, the continuous-time model of the nonlinear system is discretized to obtain a discrete model, which is then simplified using a zero-order hold. On this basis, the interference at the previous moment is calculated from the system states measured and stored at the previous moment, and it is added to the system input to compensate for the interference at the current moment. Finally, simulation results show that the method has a good inhibitory effect on interference, with an interference attenuation rate of more than 98%.
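
The compensation idea in this abstract can be sketched on a toy scalar system. The example below is a minimal illustration under stated assumptions, not the article's method: the model `x[k+1] = f(x[k]) + b*u[k] + d[k]`, the dynamics `f`, the gain `b`, the sinusoidal disturbance, and the function name `simulate` are all hypothetical.

```python
import math


def simulate(steps, compensate, dt=0.01):
    """Simulate x[k+1] = f(x[k]) + b*u[k] + d[k] and return the peak |x|.

    With compensate=True, the disturbance at the previous step is recovered
    from the stored state and input and subtracted through the control input.
    """
    def f(x):
        return 0.9 * x  # hypothetical nominal dynamics

    b = 1.0  # hypothetical input gain
    x_prev, u_prev, x = 0.0, 0.0, 0.0
    peak = 0.0
    for k in range(steps):
        if compensate and k > 0:
            # x[k] = f(x[k-1]) + b*u[k-1] + d[k-1], so d[k-1] follows exactly.
            d_hat = x - f(x_prev) - b * u_prev
        else:
            d_hat = 0.0
        u = -d_hat / b  # the nominal control is taken as zero in this sketch
        d = math.sin(2.0 * math.pi * k * dt)  # hypothetical slow disturbance
        x_next = f(x) + b * u + d
        x_prev, u_prev, x = x, u, x_next
        peak = max(peak, abs(x))
    return peak


# Usage: compare the peak state deviation with and without compensation.
peak_without = simulate(500, compensate=False)
peak_with = simulate(500, compensate=True)
```

In this sketch the compensated state is driven only by the one-step change in the disturbance rather than by the disturbance itself, which is why the approach suits disturbances that vary slowly relative to the sampling period.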


Safe Exploration in Model-Based Reinforcement Learning

Cohen, Belta. 2023. Synthesis Lectures on Computer Science.
