Long-Run Risk-Sensitive Impulse Control

Jelito, Damian; Pitera, Marcin; Stettner, Łukasz

doi:10.1137/19m1305355

Cited by 13 publications

(5 citation statements)

References 44 publications


“…where the second inequality follows from (29), the third inequality from log E[f(s) + ḡ(s)] ≥ E(f(s)) + E(ḡ(s)), the fourth inequality from Jensen's inequality, and the last one from Assumption 1. By changing the roles of $Q^{\pi_1}_k$ with $Q^{\pi_2}_k$ we obtain (18). This completes the proof.…”

Section: Appendix A, Proof of Proposition 1 (mentioning)

confidence: 53%

A Risk-Averse Preview-based $Q$-Learning Algorithm: Application to Highway Driving of Autonomous Vehicles

Mazouchi, Nageshrao, Modares

2021

Preprint


A risk-averse preview-based Q-learning planner is presented for navigation of autonomous vehicles. To this end, the multi-lane road ahead of a vehicle is represented by a finite-state non-stationary Markov decision process (MDP). A risk assessment unit module is then presented that leverages the preview information provided by sensors along with a stochastic reachability module to assign reward values to the MDP states and update them as scenarios develop. A sampling-based risk-averse preview-based Q-learning algorithm is finally developed that generates samples using the preview information and reward function to learn risk-averse optimal planning strategies without actual interaction with the environment. The risk factor is imposed on the objective function to avoid fluctuation of the Q values, which can jeopardize the vehicle's safety and/or performance. The overall hybrid automaton model of the system is leveraged to develop a feasibility check unit module that detects infeasible plans and enables the planner system to proactively react to changes in the environment. Theoretical results are provided to bound the number of samples required to guarantee $\epsilon$-optimal planning with high probability. Finally, to verify the efficiency of the presented algorithm, its implementation on highway driving of an autonomous vehicle in varying traffic density is considered.

“…This paper extends the results from [17], where the function G is assumed to be bounded. In that case, it can be shown that the Bellman equation admits a unique solution, which can be used to prove continuity of the function u ≡ w. This result was one of the main building blocks used in [16], where the long-run impulse control problem was analysed. In the present paper we show a more general sufficient condition for the identity u ≡ w. This may be used to generalise the results from [16] to the unbounded case.…”

Section: Introduction (mentioning)

confidence: 98%

Risk-sensitive optimal stopping with unbounded terminal cost function

Jelito, Stettner

2022

Electron. J. Probab.

Self Cite


In this paper we consider an infinite time horizon risk-sensitive optimal stopping problem for a Feller-Markov process with an unbounded terminal cost function. We show that in the unbounded case an associated Bellman equation may have multiple solutions and we give a probabilistic interpretation for the minimal and the maximal one. Also, we show how to approximate them using finite time horizon problems. The analysis, covering both discrete and continuous time case, is supported with illustrative examples.


“…This paper extends the results from Jelito et al (2021), where the function G is assumed to be bounded. In that case, it can be shown that Bellman equation admits a unique solution, which can be used to prove continuity of the function u ≡ w. This result was one of the main building blocks used in Jelito et al (2020), where the long-run impulse control problem was analysed. In the present paper we show a more general sufficient condition for the identity u ≡ w. This may be used to generalise the results from Jelito et al (2020) to the unbounded case.…”

Section: Introduction (mentioning)

confidence: 98%