This paper proposes a distributionally robust approach to regret optimal control of discrete-time linear dynamical systems with quadratic costs subject to stochastic additive disturbance on the state process. The underlying probability distribution of the disturbance process is unknown, but assumed to lie in a given ball of distributions defined in terms of the type-2 Wasserstein distance. In this framework, strictly causal linear disturbance feedback controllers are designed to minimize the worst-case expected regret. The regret incurred by a controller is defined as the difference between the cost it incurs in response to a realization of the disturbance process and the cost incurred by the optimal noncausal controller which has perfect knowledge of the disturbance process realization at the outset. Building on a well-established duality theory for optimal transport problems, we show how to equivalently reformulate this minimax regret optimal control problem as a tractable semidefinite program. The equivalent dual reformulation also allows us to characterize a worst-case distribution achieving the worst-case expected regret in relation to the distribution at the center of the Wasserstein ball.
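The regret definition and the minimax problem described above can be written compactly as follows. This is only a sketch of the formulation: the symbols $w$ (disturbance realization), $\pi$ (strictly causal linear disturbance feedback controller), $\Pi$ (the class of such controllers), $J$ (quadratic cost), $P_0$ (center of the Wasserstein ball) and $\rho$ (its radius) are assumed notation and do not appear in the abstract.

```latex
% Assumed notation, introduced for illustration only.
% J(\pi, w): cost of controller \pi under disturbance realization w.
% \min_u J(u, w): cost of the optimal noncausal input sequence, chosen with full knowledge of w.
\begin{align*}
R(\pi, w) &= J(\pi, w) - \min_{u} J(u, w), \\
\mathcal{B}_\rho(P_0) &= \{\, P : W_2(P, P_0) \le \rho \,\}, \\
\pi^\star &\in \arg\min_{\pi \in \Pi} \; \sup_{P \in \mathcal{B}_\rho(P_0)} \; \mathbb{E}_{w \sim P}\bigl[ R(\pi, w) \bigr].
\end{align*}
```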
We consider continuous-time stochastic optimal control problems featuring Conditional Value-at-Risk (CVaR) in the objective. The major difficulty in these problems arises from time-inconsistency, which prevents us from directly using dynamic programming. To resolve this challenge, we convert the problem into an equivalent bilevel optimization problem in which the inner optimization problem is a standard stochastic control problem. Furthermore, we provide conditions under which the outer objective function is convex and differentiable. We compute the outer objective's value via a Hamilton-Jacobi-Bellman equation and its gradient via the viscosity solution of a linear parabolic equation, which allows us to perform gradient descent. The significance of this result is that we provide an efficient dynamic programming-based algorithm for optimal control of CVaR without lifting the state space. To broaden the applicability of the proposed algorithm, we propose convergent approximation schemes for cases where our key assumptions do not hold and characterize the relevant suboptimality bounds. In addition, we extend our method to a more general class of risk metrics, which includes mean-variance and median-deviation. We also demonstrate a concrete application to portfolio optimization under CVaR constraints. Our results contribute an efficient framework for solving time-inconsistent CVaR-based sequential optimization.
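A bilevel structure of this kind commonly arises from the Rockafellar-Uryasev representation CVaR_alpha(L) = min_y { y + E[(L - y)^+] / (1 - alpha) }, where the outer variable y is a scalar. Whether the paper uses exactly this form is an assumption here. The minimal Python sketch below illustrates only the outer minimization on a fixed sample of losses: there is no control problem, and the inner expectation is a plain sample mean rather than the solution of a Hamilton-Jacobi-Bellman equation. It also replaces the gradient descent mentioned above with an exact search over the sample points. All function names are hypothetical.

```python
import numpy as np

def ru_objective(y, losses, alpha):
    """Rockafellar-Uryasev objective: y + E[(L - y)^+] / (1 - alpha)."""
    return y + np.mean(np.maximum(losses - y, 0.0)) / (1.0 - alpha)

def cvar_by_outer_minimization(losses, alpha):
    """Minimize the objective over the scalar y (illustrative, not the paper's method).

    For an empirical distribution the objective is convex and piecewise
    linear in y with kinks at the samples, so a minimizer lies at a sample.
    """
    candidates = np.sort(losses)
    values = np.array([ru_objective(y, losses, alpha) for y in candidates])
    best = int(np.argmin(values))
    return candidates[best], values[best]

def cvar_by_tail_average(losses, alpha):
    """Mean of the worst (1 - alpha) fraction of losses.

    Matches the minimized objective when (1 - alpha) * n is an integer.
    """
    k = int(round((1.0 - alpha) * len(losses)))
    return np.sort(losses)[-k:].mean()

# Synthetic losses, assumed purely for illustration.
rng = np.random.default_rng(0)
losses = rng.normal(size=1000)
alpha = 0.95

y_star, cvar_min = cvar_by_outer_minimization(losses, alpha)
cvar_tail = cvar_by_tail_average(losses, alpha)
print(f"minimizing y = {y_star:.4f}")
print(f"CVaR via outer minimization = {cvar_min:.4f}")
print(f"CVaR via tail average       = {cvar_tail:.4f}")
```

The two estimates coincide on this sample because the objective's slope in y changes sign exactly where 50 of the 1000 losses lie above y. In the control setting described above, each evaluation of the outer objective would instead require solving the inner stochastic control problem.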