2020
DOI: 10.1609/aaai.v34i04.6116

Efficient Projection-Free Online Methods with Stochastic Recursive Gradient

Abstract: This paper focuses on projection-free methods for solving smooth Online Convex Optimization (OCO) problems. Existing projection-free methods either achieve suboptimal regret bounds or have high per-round computational costs. To fill this gap, two efficient projection-free online methods called ORGFW and MORGFW are proposed for solving stochastic and adversarial OCO problems, respectively. By employing a recursive gradient estimator, our methods achieve optimal regret bounds (up to a logarithmic factor) while p…
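The abstract is truncated above, but it names the two ingredients of the proposed methods: a recursive gradient estimator and a projection-free (linear-oracle-based) update. The sketch below shows, in a minimal form, how such ingredients typically combine; the constraint set (an ℓ1 ball), the step and mixing schedules, and all function names are illustrative assumptions, not the paper's exact ORGFW/MORGFW pseudocode.

```python
import numpy as np

def lmo_l1_ball(grad, radius=1.0):
    """Linear minimization oracle over an l1 ball:
    argmin_{||v||_1 <= radius} <grad, v> is a signed, scaled basis vector."""
    i = int(np.argmax(np.abs(grad)))
    v = np.zeros_like(grad)
    v[i] = -radius * np.sign(grad[i])
    return v

def orgfw_like(stoch_grad, x0, T, radius=1.0):
    """Sketch of an online Frank-Wolfe update driven by a recursive gradient
    estimator. stoch_grad(x, t) returns a stochastic gradient of the round-t
    loss at x; the schedules below are illustrative assumptions."""
    x = np.asarray(x0, dtype=float).copy()
    d_est = stoch_grad(x, 0)                    # initial gradient estimate
    iterates = [x.copy()]
    for t in range(1, T):
        x_prev = x.copy()
        eta = 2.0 / (t + 2)                     # assumed Frank-Wolfe step size
        rho = 1.0 / (t + 1) ** (2.0 / 3.0)      # assumed mixing weight
        v = lmo_l1_ball(d_est, radius)          # linear optimization, no projection
        x = (1.0 - eta) * x + eta * v           # convex combination stays in the set
        g_new = stoch_grad(x, t)
        g_old = stoch_grad(x_prev, t)
        # recursive estimator: reuse the previous estimate corrected by the
        # gradient difference, mixed with a fresh stochastic gradient
        d_est = (1.0 - rho) * (d_est + g_new - g_old) + rho * g_new
        iterates.append(x.copy())
    return iterates
```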

Cited by 20 publications (19 citation statements) · References 13 publications
“…We first introduce the setup of the communication graph for the multi-agent network. More specifically, we consider that 𝑁 agents communicate in a random network [23,33], where each pair of agents is linked with probability 0.5 (graphs that are not connected are discarded), and the weight matrix 𝑊 is defined based on the Metropolis rule [26]:…”
Section: Methodsmentioning
confidence: 99%
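The Metropolis rule referenced in this excerpt is a standard way to build a doubly stochastic weight matrix from a graph. Below is a small illustrative sketch of the quoted setup (each pair of agents linked with probability 0.5, resampled until connected, then Metropolis weights); NumPy and the helper names are assumptions, not the cited paper's code.

```python
import numpy as np

def is_connected(A):
    """Breadth-first check that the undirected graph with adjacency A is connected."""
    n = A.shape[0]
    seen, stack = {0}, [0]
    while stack:
        i = stack.pop()
        for j in np.nonzero(A[i])[0]:
            if int(j) not in seen:
                seen.add(int(j))
                stack.append(int(j))
    return len(seen) == n

def random_connected_graph(n, p=0.5, seed=0):
    """Link each pair of agents with probability p and resample until the
    graph is connected, matching the quoted experimental setup."""
    rng = np.random.default_rng(seed)
    while True:
        upper = np.triu(rng.random((n, n)) < p, k=1)
        A = (upper | upper.T).astype(float)     # symmetric adjacency, no self-loops
        if is_connected(A):
            return A

def metropolis_weights(A):
    """Metropolis rule: W_ij = 1 / (1 + max(d_i, d_j)) for each edge (i, j),
    W_ii = 1 - sum_{j != i} W_ij, and zero for non-neighbors."""
    n = A.shape[0]
    deg = A.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and A[i, j] > 0:
                W[i, j] = 1.0 / (1.0 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()
    return W
```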
“…To tackle this challenge and also accelerate online meta-learning, we take a closer look at the distributed network-level OCO in this section, and devise a distributed OGD algorithm with gradient tracking. For ease of exposition, we consider a more general formulation [3,5,12,33] for the distributed network-level OCO (3): in iteration 𝑡, agent 𝑖 makes a local model prediction 𝑥_{𝑡,𝑖} from a convex compact set K ⊂ R^𝑑 and incurs a convex loss 𝑓_{𝑡,𝑖}(𝑥_{𝑡,𝑖}) that follows some unknown distribution P, i.e., 𝑓_{𝑡,𝑖} ∼ P, for any 𝑡 and 𝑖 ∈ N. The stochastic assumption about the loss functions corresponds, in an implicit manner, to the underlying task distribution P_T of meta-learning.…”
Section: Distributed Network-level Online Convex Optimizationmentioning
confidence: 99%
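The "distributed OGD algorithm with gradient tracking" mentioned in this excerpt usually amounts to a consensus step on the local models plus a second consensus variable that tracks the network-average gradient. The sketch below is an assumed, generic form of one such round; the cited paper's exact recursion, step sizes, and notation may differ.

```python
import numpy as np

def distributed_ogd_gt_round(X, Y, grad_prev, grad_fn, W, eta, project):
    """One assumed round of distributed online gradient descent with gradient
    tracking. X: (N, d) local models, Y: (N, d) gradient-tracking variables,
    grad_prev: (N, d) gradients at the current models, grad_fn(i, x): a
    stochastic gradient of agent i's next loss at x, W: doubly stochastic
    mixing matrix, project: Euclidean projection onto the feasible set K."""
    # consensus on the models, then a descent step along the tracked direction
    X_next = np.stack([project(x) for x in (W @ X - eta * Y)])
    # each agent refreshes its gradient at its new local model
    grad_next = np.stack([grad_fn(i, X_next[i]) for i in range(X.shape[0])])
    # gradient tracking: mix neighbors' trackers, add the local gradient change
    Y_next = W @ Y + (grad_next - grad_prev)
    return X_next, Y_next, grad_next
```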
“…It has been further improved in [4] by introducing the lazy conditional gradient. Xie et al. [25] introduced the recursive variance reduction technique to reduce the noise in stochastic gradient estimates. Although these methods have enjoyed success in reducing the projection cost, they still require at least one projection or linear optimization step per round, both of which can be prohibitive in large-scale learning problems.…”
Section: Related Workmentioning
confidence: 99%
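To make concrete why a linear optimization step is generally cheaper than a projection, yet still nontrivial, consider the nuclear-norm ball: the linear minimization oracle needs only the leading singular pair of the gradient, whereas a Euclidean projection onto the same set requires a full SVD. The sketch below is purely illustrative; the power-iteration accuracy and the function name are assumptions.

```python
import numpy as np

def lmo_nuclear_ball(G, tau=1.0, iters=50, seed=0):
    """Linear minimization over the nuclear-norm ball {X : ||X||_* <= tau}.
    Only the leading singular pair of the gradient G is needed (here found by
    a crude power iteration), whereas projecting onto the same ball would
    require a full SVD of the iterate."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(G.shape[1])
    u = G @ v
    for _ in range(iters):
        u = G @ v
        u /= np.linalg.norm(u) + 1e-12
        v = G.T @ u
        v /= np.linalg.norm(v) + 1e-12
    return -tau * np.outer(u, v)    # minimizes <G, X> over the ball
```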
“…Recently, several variants have also been developed to improve its performance [36,38,21,18,4,9]. When f is nonconvex, [35] showed that FW converges to a stationary point at a rate O(1/√t) in the Frank-Wolfe gap max_{v∈C} ⟨x_t − v, ∇f(x_t)⟩ [28], which has inspired a line of work in stochastic optimization [48,52,51,10,53].…”
Section: Introductionmentioning
confidence: 99%
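The Frank-Wolfe gap quoted above, max_{v∈C} ⟨x_t − v, ∇f(x_t)⟩, is the usual stationarity measure for projection-free methods on nonconvex problems. For a simple box constraint it has a closed form, as in the assumed sketch below.

```python
import numpy as np

def fw_gap_box(x, grad, lo=-1.0, hi=1.0):
    """Frank-Wolfe gap max_{v in C} <grad, x - v> over the box C = [lo, hi]^d.
    The maximizing v sends each coordinate to the bound that minimizes
    <grad, v>; for x in C the gap is nonnegative and vanishes exactly at
    stationary points of the constrained problem."""
    v = np.where(grad >= 0.0, lo, hi)
    return float(grad @ (x - v))
```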