2021
DOI: 10.48550/arxiv.2110.12544
Preprint

Online estimation and control with optimal pathlength regret

Abstract: A natural goal when designing online learning algorithms for non-stationary environments is to bound the regret of the algorithm in terms of the temporal variation of the input sequence. Intuitively, when the variation is small, it should be easier for the algorithm to achieve low regret, since past observations are predictive of future inputs. Such data-dependent "pathlength" regret bounds have recently been obtained for a wide variety of online learning problems, including OCO and bandits. We obtain the firs…
