2021
DOI: 10.48550/arxiv.2106.12097
Preprint

Regret-optimal Estimation and Control

Abstract: We consider estimation and control in linear time-varying dynamical systems from the perspective of regret minimization. Unlike most prior work in this area, we focus on the problem of designing causal estimators and controllers which compete against a clairvoyant noncausal policy, instead of the best policy selected in hindsight from some fixed parametric class. We show that the regret-optimal estimator and regret-optimal controller can be derived in state-space form using operator-theoretic techniques from r…

Cited by 4 publications (16 citation statements)
References 9 publications
“…Dynamic regret is a very similar metric to competitive ratio, which we consider in this paper, except that it is the difference between the cost of the online and offline controllers, rather than the ratio of the costs. The problem of designing controllers with optimal dynamic regret was studied in the finite-horizon, time-varying setting in [8], in the infinite-horizon LTI setting in [17], and in the measurement-feedback setting in [9]. Gradient-based algorithms with low dynamic regret against the class of disturbance-action policies were obtained in [12], [19].…”
Section: B Related Work (mentioning)
confidence: 99%
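The distinction drawn in this statement (regret as a difference of costs, competitive ratio as a ratio of costs) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the cited papers: a scalar system x_{t+1} = a*x_t + u_t + w_t with x_0 = 0, a quadratic cost, a fixed-gain causal feedback controller with a hypothetical gain k, and a clairvoyant benchmark computed by least squares with full knowledge of the disturbance sequence.

```python
import numpy as np

def rollout_cost(a, u, w, q=1.0, r=1.0):
    """Cost sum_t (q*x_{t+1}^2 + r*u_t^2) for x_{t+1} = a*x_t + u_t + w_t, x_0 = 0."""
    x, cost = 0.0, 0.0
    for ut, wt in zip(u, w):
        x = a * x + ut + wt
        cost += q * x**2 + r * ut**2
    return cost

def causal_controls(a, k, w):
    """Causal policy (illustrative): u_t = -k*x_t depends only on the current state."""
    x, u = 0.0, []
    for wt in w:
        ut = -k * x
        u.append(ut)
        x = a * x + ut + wt
    return np.array(u)

def clairvoyant_controls(a, w, q=1.0, r=1.0):
    """Noncausal benchmark: minimize the same cost knowing the whole sequence w."""
    T = len(w)
    idx = np.arange(T)
    # Stacked states satisfy x = L @ (u + w), with L[t, s] = a**(t - s) for s <= t.
    L = np.tril(a ** (idx[:, None] - idx[None, :]).astype(float))
    # Least-squares form of q*||L(u + w)||^2 + r*||u||^2.
    A = np.vstack([np.sqrt(q) * L, np.sqrt(r) * np.eye(T)])
    b = np.concatenate([-np.sqrt(q) * (L @ w), np.zeros(T)])
    u, *_ = np.linalg.lstsq(A, b, rcond=None)
    return u

rng = np.random.default_rng(0)
a, k = 0.9, 0.5                      # hypothetical system and gain
w = rng.standard_normal(50)          # one disturbance realization

cost_causal = rollout_cost(a, causal_controls(a, k, w), w)
cost_clairvoyant = rollout_cost(a, clairvoyant_controls(a, w), w)

regret = cost_causal - cost_clairvoyant   # difference of costs
ratio = cost_causal / cost_clairvoyant    # ratio of costs
print(f"regret = {regret:.3f}, competitive ratio on this w = {ratio:.3f}")
```

This evaluates both metrics on a single disturbance sequence only; the guarantees discussed in the citing papers concern the worst case over all disturbances, which this sketch does not compute.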
“…A controller whose competitive ratio is bounded above by C offers the following guarantee: the cost it incurs is always at most a factor of C higher than the cost that could have been counterfactually incurred by any other controller, irrespective of how the disturbance is generated. Competitive ratio is a multiplicative analog of dynamic regret; the problem of obtaining controllers with optimal dynamic regret was recently considered in [8], [9], [17].…”
Section: Introduction (mentioning)
confidence: 99%
“…The problem of designing controllers with optimal dynamic regret was studied in the finite-horizon, time-varying setting in [5], in the infinite-horizon LTI setting in [12], and in the measurement-feedback setting in [6]. These works all bounded regret by the energy in the disturbances; the pathlength regret bounds we obtain in this paper also imply energy regret bounds which are optimal up to a factor of 4.…”
Section: Related Work (mentioning)
confidence: 81%
“…These works all bounded regret by the energy in the disturbances; the pathlength regret bounds we obtain in this paper also imply energy regret bounds which are optimal up to a factor of 4. Filtering algorithms with energy regret bounds were obtained in the finite-horizon setting in [5] and the infinite-horizon setting in [12]. Gradient-based control algorithms with low dynamic regret against the class of disturbance-action policies were obtained in [15]; the stronger metric of adaptive regret was studied in [8].…”
Section: Related Work (mentioning)
confidence: 99%
“…Here, the benchmark, which algorithms are compared to, no longer restricts the control actions to lie in a policy class, but is achieved through the optimal non-causal control sequence. For linear dynamics and quadratic costs, it was shown in [3] for the finite horizon case and in [4] for the infinite horizon case, that the optimal dynamic regret is proportional to the disturbance energy which the system experiences. The derived regret optimal controller adapts to the experienced disturbances and can thus outperform the H2- and H∞-controller.…”
Section: Introduction (mentioning)
confidence: 99%
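One way to write down the energy-normalized regret described in the statement above is sketched here. The symbols are assumptions introduced for illustration: $J(K, w)$ is the cost of a causal controller $K$ under disturbance $w$, $J^{\star}(w)$ is the cost of the optimal noncausal control sequence for the same $w$, and $\gamma$ is a hypothetical constant; none of this notation appears in the quoted text.

```latex
\[
\operatorname{Regret}(K, w) \;=\; J(K, w) \;-\; J^{\star}(w),
\qquad
J^{\star}(w) \;=\; \min_{u \ \text{noncausal}} J(u, w),
\]
\[
\operatorname{Regret}(K, w) \;\le\; \gamma \,\lVert w \rVert_2^2
\quad \text{for all } w
\qquad\Longleftrightarrow\qquad
\sup_{w \neq 0} \frac{J(K, w) - J^{\star}(w)}{\lVert w \rVert_2^2} \;\le\; \gamma .
\]
```

Under this reading, a regret bound proportional to the disturbance energy means the smallest such $\gamma$ is finite, and a regret-optimal controller would be one that makes it as small as possible among causal controllers.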