2019 · Preprint
DOI: 10.48550/arxiv.1902.07286
Online Learning with Continuous Variations: Dynamic Regret and Reductions

Abstract: Online learning is a powerful tool for analyzing iterative algorithms. However, the classic adversarial setup sometimes fails to capture certain regularity in online problems in practice. Motivated by this, we establish a new setup, called Continuous Online Learning (COL), where the gradient of the online loss function changes continuously across rounds with respect to the learner's decisions. We show that COL covers and more appropriately describes many interesting applications, from general equilibrium problems …

Cited by 3 publications (12 citation statements)
References 21 publications
“…In other words, Theorem 1 states that, based on the identification Φ(x, x′) = f_x(x′) − f_x(x) and F(x) = ∇f_x(x), achieving sublinear dynamic regret is essentially equivalent to finding an equilibrium x⋆ ∈ X⋆, in which X⋆ denotes the set of solutions of the EP and VI (one can show these two solution sets coincide [7]). Therefore, a necessary condition for sublinear dynamic regret is that X⋆ is non-empty, which is true when ∇f_x(x) is continuous in x and X is compact [16].…”
Section: Equivalence and Hardness of Continuous Online Learning
confidence: 99%
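The equivalence quoted above can be sketched in standard notation. This is a summary under assumed conventions (dynamic regret measured against per-round minimizers, and the textbook definitions of the equilibrium problem (EP) and variational inequality (VI)); the paper's exact definitions may differ in detail:

```latex
% Dynamic regret over N rounds, against the per-round minimizers:
\mathrm{Regret}^d_N \;=\; \sum_{n=1}^{N} f_n(x_n) \;-\; \sum_{n=1}^{N} \min_{x \in X} f_n(x)

% In COL, the loss revealed in round n depends on the learner's own decision,
% f_n(\cdot) = f_{x_n}(\cdot). Define the bifunction and vector field
\Phi(x, x') \;=\; f_x(x') - f_x(x), \qquad F(x) \;=\; \nabla f_x(x).

% Standard EP and VI solution concepts (assumed definitions):
% EP:  find x^\star \in X such that \Phi(x^\star, x') \ge 0 \quad \forall x' \in X
% VI:  find x^\star \in X such that \langle F(x^\star),\, x' - x^\star \rangle \ge 0 \quad \forall x' \in X
```

Under this identification, the quoted statement reads: sublinear dynamic regret is achievable essentially if and only if an equilibrium x⋆ of the EP/VI can be found, so non-emptiness of the solution set X⋆ is a necessary condition.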
“…The proof leverages the reduction to static regret in Corollary 1. It is immediate from the fact that the online IL problem is (α, β)-regular (see Proposition 9 in the full technical report [7] for details). The dynamic regret is worse than that of the deterministic case, but it is still sublinear.…”
Section: Application to Online Imitation Learning
confidence: 99%
“…Because the online losses in OPO are designed to satisfy the performance relationship with respect to the given sequential decision making problem, the resulting online learning problem has a mixture of different properties, such as predictability, continuity, and stochasticity [11]. The interactions of these properties make the classic adversary-style online learning analysis taken by Ross et al [2] overly conservative, creating a mismatch between provable theoretical guarantees and the learning phenomena observed in practice.…”
Section: Introduction
confidence: 99%
“…This article is a significantly revised and extended version of our conference publication at the Workshop on the Algorithmic Foundations of Robotics (Lee et al, 2018a,b). In particular, this article (1) presents new theoretical results, greatly extending the formalization of dynamic regret as a metric in imitation learning; (2) provides detailed examples and analysis of well known systems that satisfy the continuity condition required in the theory; (3) explores connections with our subsequent work in Continuous Online Learning (Cheng et al, 2019a) and the variational inequality problem; (4) presents new experimental results.…”
Section: Introduction
confidence: 99%