We exhibit optimal control strategies for a simple toy problem in which the underlying dynamics depend on a parameter that is initially unknown and must be learned. We consider a cost function posed over a finite time interval, in contrast to much previous work that considers asymptotics as the time horizon tends to infinity. We study several different versions of the problem, including Bayesian control, in which we assume a prior distribution on the unknown parameter; and "agnostic" control, in which we assume nothing about the unknown parameter. For the agnostic problems, we compare our performance with that of an opponent who knows the value of the parameter. This comparison gives rise to several notions of "regret", and we obtain strategies that minimize the "worst-case regret" arising from the most unfavorable choice of the unknown parameter. In every case, the optimal strategy turns out to be a Bayesian strategy or a limit of Bayesian strategies.
We investigate the robust stability of fully probabilistic control with respect to data-driven model uncertainties. This scheme controls a system modeled via a probability density function (pdf) by computing a probabilistic control policy that is optimal in the Kullback-Leibler sense. The results are illustrated via simulations.
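In the one-step discrete case, the Kullback-Leibler-optimal policy of this scheme (fully probabilistic design) admits a closed form: the optimal randomized policy reweights an ideal input pdf by exp(−ω(u)), where ω(u) is the KL divergence from the predictive pdf under input u to the ideal state pdf. A minimal sketch, with all distribution names and the toy numbers below chosen for illustration:

```python
import numpy as np

def kl(p, q):
    """KL divergence D(p || q) for discrete distributions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def fpd_policy(s, s_ideal, c_ideal):
    """One-step fully probabilistic design, discrete case.

    s[u]       : predictive pdf of the next state given input u
    s_ideal    : ideal pdf of the next state
    c_ideal[u] : ideal pdf over inputs
    Returns the randomized policy c*(u) minimizing the KL divergence
    between the closed-loop joint pdf s[u] c(u) and the ideal joint
    pdf s_ideal c_ideal(u).
    """
    omega = np.array([kl(s[u], s_ideal) for u in range(len(s))])
    c = c_ideal * np.exp(-omega)   # reweight the ideal input pdf
    return c / c.sum()             # normalize to a proper pdf

# Toy example: input u=1 drives the state pdf closer to the ideal,
# so the optimal policy concentrates probability on it.
s = np.array([[0.7, 0.3],   # state pdf under input u=0
              [0.2, 0.8]])  # state pdf under input u=1
s_ideal = np.array([0.1, 0.9])
c_ideal = np.array([0.5, 0.5])
c_star = fpd_policy(s, s_ideal, c_ideal)
```

The closed form follows because the objective is a sum of terms c(u)·ln(c(u) / (c_ideal(u)·exp(−ω(u)))), which is itself a KL divergence and is minimized by matching the two arguments up to normalization.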
Algorithm: C^m Selection Algorithm
Data: Real numbers M > 0, τ ∈ (0, 1); an N-element set E ⊂ R^n; and a convex polytope K(x) ⊂ R^D for each x ∈ E. (We suppose that each K(x) is specified by at most C linear constraints.)
Result: One of the following two outcomes.
Success: We exhibit a point of K(x) for each x ∈ E. Moreover, we guarantee that there exists F ∈ C^m(R^n, R^D) with norm at most CM such that F(x) ∈ K(x) for each x ∈ E.
No go: We guarantee that there exists no F ∈ C^m(R^n, R^D) with norm at most M such that F(x) ∈ K(x) for each x ∈ E.
Algorithm 2: C^m selection algorithm description.
The first step is to place the problem in a wider context. Instead of merely examining the values of F at points x ∈ E, we consider the (m − 1)-st degree Taylor polynomial of F at x, which we denote by J_x(F). We write P to denote the vector space of all such Taylor polynomials. Instead of families of convex sets K(x) ⊂ R^D, we consider families of convex sets Γ(x, M, τ) ⊂ P (x ∈ E, M > 0, τ ∈ (0, 1)). We want to find F ∈ C^m(R^n, R^D) with norm at most CM, such that J_x(F) ∈ Γ(x, M, τ) for all x ∈ E. Under suitable assumptions on the Γ(x, M, τ), we provide the following algorithm.
Algorithm: Generalized Selection Algorithm
Data: Real numbers M > 0, τ ∈ (0, 1); a suitable family of convex sets Γ(x, M, τ).
Result: One of the following two outcomes.
Success: We exhibit a polynomial P_x ∈ Γ(x, CM, Cτ) for each x ∈ E. Moreover, we guarantee that there exists F ∈ C^m(R^n, R^D) with norm at most CM such that J_x(F) = P_x for each x ∈ E.
No go: We guarantee that there exists no F ∈ C^m(R^n, R^D) with norm at most M such that J_x(F) ∈ Γ(x, M, τ) for all x ∈ E.
The algorithm requires at most C(τ) N log N operations. Our previous C^m selection algorithm is a special case of the Generalized Selection Algorithm. Once we are dealing with the Γ's, we can take D = 1 without loss of generality.
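The success / no-go dichotomy can be illustrated on the simplest analogue of the selection problem: n = 1, D = 1, m = 1 (Lipschitz functions), with each K(x) an interval. A single forward sweep after sorting (O(N log N) total) either certifies infeasibility or produces values realizable by an M-Lipschitz function, and in this toy case no loss of constant C is needed. This sketch is ours, not the paper's algorithm:

```python
def lipschitz_select(points, M):
    """Toy 1-D selection: given sites x_i with target intervals
    K(x_i) = [a_i, b_i], either return values f(x_i) in K(x_i)
    attainable by an M-Lipschitz function ("success"), or return
    None, certifying that no such function exists ("no go").

    points: list of (x, a, b) triples with a <= b.
    """
    pts = sorted(points)  # sort by x: the O(N log N) step
    # Forward sweep: feas[j] is the set of values f(x_j) attainable
    # by some M-Lipschitz selection on the first j+1 sites.
    lo, hi = pts[0][1], pts[0][2]
    feas = [(lo, hi)]
    for j in range(1, len(pts)):
        dx = pts[j][0] - pts[j - 1][0]
        lo = max(pts[j][1], lo - M * dx)  # intersect K(x_j) with the
        hi = min(pts[j][2], hi + M * dx)  # M-Lipschitz reachable band
        if lo > hi:
            return None  # "no go": the constraints are incompatible
        feas.append((lo, hi))
    # Backward sweep: pick concrete values inside the propagated
    # intervals; clamping keeps consecutive steps within M * dx.
    vals = [0.0] * len(pts)
    vals[-1] = 0.5 * (feas[-1][0] + feas[-1][1])
    for j in range(len(pts) - 2, -1, -1):
        vals[j] = min(max(vals[j + 1], feas[j][0]), feas[j][1])
    return [(x, v) for (x, _, _), v in zip(pts, vals)]
```

Any M-Lipschitz interpolation of the returned values (e.g. piecewise linear) then gives a function F with F(x) ∈ K(x) for all x ∈ E, mirroring the "Success" outcome; the early return mirrors "No go".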