2018
DOI: 10.48550/arxiv.1807.03545

Dual optimization for convex constrained objectives without the gradient-Lipschitz assumption

Abstract: The minimization of convex objectives coming from linear supervised learning problems, such as penalized generalized linear models, can be formulated as the minimization of a finite sum of convex functions. For such problems, a large set of stochastic first-order solvers based on the idea of variance reduction is available, combining computational efficiency with sound theoretical guarantees (linear convergence rates) [19], [35], [36], [13]. Such rates are obtained under both gradient-Lipschitz and strong convexity assumptions…
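For context, the setting the abstract refers to can be written as the following finite-sum problem (a standard formulation; the notation here is illustrative and not taken from the paper itself):

\[
\min_{w \in \mathbb{R}^d} \; F(w) := \frac{1}{n} \sum_{i=1}^{n} f_i(w) + g(w),
\]

where each $f_i(w) = \ell(y_i, x_i^\top w)$ is the convex loss attached to one observation and $g$ is a convex penalty. Variance-reduced solvers such as SAG, SVRG, SAGA or SDCA achieve linear rates when every $f_i$ has an $L$-Lipschitz gradient, $\|\nabla f_i(w) - \nabla f_i(w')\| \le L \|w - w'\|$, and $F$ is $\mu$-strongly convex; the paper's contribution, per its title, is a dual approach that removes the first of these two assumptions.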

Cited by 4 publications (4 citation statements)
References 22 publications
“…This is obviously detrimental since, in practice, time sequences are increasingly abundant and large. On the other hand, improving the computational effectiveness of estimation procedures for Hawkes processes is a current direction of research (Bompaire et al, 2018).…”
Section: Discussion (mentioning; confidence: 99%)
“…Maximum likelihood estimation. A different paradigm consists in maximising the log-likelihood of the sample path, see Daley and Vere-Jones [14]. To the best of our knowledge, the fastest parametric approach, in the case where the kernels are a sum of exponentials with fixed decay rates, is that in Bompaire, Bacry, and Gaïffas [9]; we use this algorithm as a parametric baseline in our numerical examples. Lemonnier and Vayatis [27] use Bernstein polynomials to give a density argument to justify the choice of a linear combination of exponential decays.…”
Section: MHP Estimation Methods (mentioning; confidence: 99%)
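For readers unfamiliar with the model class mentioned in this statement, a multivariate Hawkes process with sum-of-exponentials kernels and fixed decay rates is typically parametrised as follows (a standard form, not necessarily the exact notation of [9]):

\[
\lambda_i(t) = \mu_i + \sum_{j=1}^{D} \sum_{t_k^j < t} \sum_{u=1}^{U} a_{ij}^{u} \, \beta_u \, e^{-\beta_u (t - t_k^j)},
\qquad i = 1, \dots, D,
\]

where the decays $\beta_1, \dots, \beta_U$ are fixed in advance, and maximum likelihood estimation fits the baselines $\mu_i$ and the non-negative weights $a_{ij}^{u}$ by maximising the point-process log-likelihood of the observed event times $t_k^j$.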
“…• SumExp: an SBF exponential MHP model, fitted using the algorithm in Bompaire et al [9]. This is an interesting benchmark to evaluate the quality of the fit for the decay parameter β in a non-SBF exponential using our method.…”
Section: Numerical Experiments (mentioning; confidence: 99%)
“…Such a function is then used to design an algorithm which is closely related to BPG. Since then, many other extensions of BPG have been developed to deal with the issue of lack of Lipschitz gradient [5,8,10,12,13,16,23,25]. For example, in [12,13] an inertial variant of BPG has been proposed, which relies on a Nesterov momentum-type update.…”
Section: Introduction (mentioning; confidence: 99%)
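For reference, the Bregman Proximal Gradient (BPG) step that this statement refers to replaces the Euclidean proximal-gradient update with a Bregman distance $D_h$ generated by a reference function $h$ (a standard sketch, not specific to any of the cited extensions):

\[
x_{k+1} \in \operatorname*{arg\,min}_{x} \left\{ g(x) + \langle \nabla f(x_k), x - x_k \rangle + \frac{1}{\gamma} D_h(x, x_k) \right\},
\qquad
D_h(x, y) = h(x) - h(y) - \langle \nabla h(y), x - y \rangle.
\]

Convergence is then obtained under relative smoothness, i.e. $Lh - f$ convex for some $L > 0$, which substitutes for the gradient-Lipschitz assumption; taking $h(x) = \tfrac{1}{2}\|x\|^2$ recovers the usual proximal gradient method.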