An Accelerated Method For Decentralized Distributed Stochastic Optimization Over Time-Varying Graphs
Preprint (2021)
DOI: 10.48550/arxiv.2103.15598

Cited by 3 publications (7 citation statements)
References 0 publications
“…For saddle-point problems, by replacing in the Smoothing scheme the batched-consensus Accelerated gradient method [56], which is optimal for decentralized convex problems, with the batched-consensus Extragradient method [7], which is optimal for decentralized convex-concave saddle-point problems, we lose a ∼√d factor in the number of communication rounds in comparison with optimal gradient-free methods for non-smooth decentralized saddle-point problems. To sum up, in distributed optimization, for the first time, we have a situation where the Smoothing scheme generates a non-optimal method from an optimal one.…”
Section: Distributed Optimization (mentioning)
confidence: 99%
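For context on where the ∼√d factor above typically enters smoothing-based constructions, the following standard randomized-smoothing bounds may help; this is a generic sketch (the notation f_γ, M, B_2^d and the choice of the uniform distribution on the unit Euclidean ball are mine, not taken from the quoted works):

\[
  f_\gamma(x) = \mathbb{E}_{e \sim U(B_2^d)}\bigl[f(x + \gamma e)\bigr],
  \qquad
  f(x) \le f_\gamma(x) \le f(x) + \gamma M,
\]
\[
  \bigl\|\nabla f_\gamma(x) - \nabla f_\gamma(y)\bigr\|_2 \le \frac{M\sqrt{d}}{\gamma}\,\|x - y\|_2
  \quad \text{for convex, } M\text{-Lipschitz (in 2-norm) } f .
\]

Taking γ of order ε/M makes f_γ an O(ε)-accurate surrogate whose smoothness constant carries an explicit √d; this is the usual source of the dimension-dependent factors in complexities obtained via smoothing.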
“…surveys [34,17]. In particular, there exists a batched-consensus-projected Accelerated gradient method [56] that, for f from (11) which is µ-strongly convex in the 2-norm and has an L-Lipschitz gradient in the 2-norm, requires…”
Section: Distributed Optimization (mentioning)
confidence: 99%
“…It was shown in [46] that to obtain ε-optimal solutions, the gradient computation complexity is lower bounded by O(√κ · log(1/ε)), and the communication complexity is lower bounded by O(√(κ/θ) · log(1/ε)). To obtain better complexities, many accelerated decentralized gradient-type methods have been developed (e.g., [11,12,16,18,20,21,22,23,24,38,41,42,43,46,54,57,61,62]). There exist dual-based methods such as [46] that achieve optimal complexities.…”
(mentioning)
confidence: 99%
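The excerpt does not define κ and θ; in the decentralized-optimization literature they usually stand for the following quantities, which is the reading assumed here (an assumption about the quoted paper's notation, not a statement from it):

\[
  \kappa = \frac{L}{\mu} \quad \text{(condition number of the local objectives)},
  \qquad
  \theta = 1 - \sigma_2(W) \quad \text{(spectral gap of the gossip matrix } W\text{)}.
\]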
“…In this paper, we focus on dual-free methods or gradient-type methods only. Some algorithms, for instance [16,22,23,41,42,61,62], rely on inner loops to guarantee desirable convergence rates. However, inner loops place a heavier communication burden [24,38], which may limit the applicability of these methods, since communication has often been recognized as the major bottleneck in distributed or decentralized optimization.…”
(mentioning)
confidence: 99%
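To make the communication-per-iteration trade-off in the excerpt concrete, below is a minimal, self-contained sketch of a plain decentralized gradient method that spends exactly one gossip round per iteration. It is not the accelerated method of the paper nor any specific algorithm cited above; the ring topology, Metropolis weights, quadratic losses and step size are synthetic choices for illustration only.

import numpy as np

rng = np.random.default_rng(0)

n_nodes, dim = 5, 3
# Each node i privately holds f_i(x) = 0.5 * ||A_i x - b_i||^2 (synthetic, hypothetical data).
A = [rng.standard_normal((10, dim)) for _ in range(n_nodes)]
b = [rng.standard_normal(10) for _ in range(n_nodes)]

def grad(i, x):
    # gradient of the local objective f_i at x
    return A[i].T @ (A[i] @ x - b[i])

# Symmetric, doubly stochastic gossip matrix for a ring graph (Metropolis weights).
W = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    for j in ((i - 1) % n_nodes, (i + 1) % n_nodes):
        W[i, j] = 1.0 / 3.0
    W[i, i] = 1.0 - W[i].sum()

X = np.zeros((n_nodes, dim))   # row i is the local iterate of node i
step = 1e-2
for _ in range(2000):
    grads = np.array([grad(i, X[i]) for i in range(n_nodes)])
    X = W @ X - step * grads   # one communication round + one local gradient step per iteration

# With a constant step, plain decentralized gradient descent only reaches a
# neighborhood of the optimum; accelerated / corrected methods (the subject of
# the excerpts above) remove this bias and tighten the rate.
x_bar = X.mean(axis=0)
print("consensus gap:", np.linalg.norm(X - x_bar))
print("objective    :", sum(0.5 * np.linalg.norm(A[i] @ x_bar - b[i]) ** 2
                            for i in range(n_nodes)))

Replacing the single gossip step W @ X with an inner loop of k gossip rounds per iteration sharpens consensus but multiplies the communication count by k, which is the burden the excerpt attributes to inner-loop methods; accelerated schemes aim to keep the total number of rounds close to the O(√(κ/θ) · log(1/ε)) bound quoted earlier.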