2019
DOI: 10.48550/arxiv.1905.01332
Preprint

A Theoretical and Empirical Comparison of Gradient Approximations in Derivative-Free Optimization

Abstract: In this paper, we analyze several methods for approximating the gradient of a function using only function values. These methods include finite differences, linear interpolation, Gaussian smoothing and smoothing on a unit sphere. The methods differ in the number of functions sampled, the choice of the sample points and the way in which the gradient approximations are derived. For each method, we derive bounds on the number of samples and the sampling radius which guarantee favorable convergence properties for …
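As a rough illustration of two of the estimators compared in the paper, the sketch below implements a forward finite-difference estimate and a Monte-Carlo Gaussian-smoothing estimate. The step size h, sampling radius sigma and sample count m are placeholder values, not the bounds derived in the paper.

```python
import numpy as np

def fd_gradient(f, x, h=1e-6):
    """Forward finite-difference gradient estimate: n extra function evaluations."""
    fx = f(x)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = 1.0
        g[i] = (f(x + h * e) - fx) / h
    return g

def gs_gradient(f, x, sigma=1e-2, m=100, rng=None):
    """Monte-Carlo estimate of the gradient of the Gaussian-smoothed objective."""
    rng = np.random.default_rng() if rng is None else rng
    fx = f(x)
    g = np.zeros_like(x)
    for _ in range(m):
        u = rng.standard_normal(x.size)
        g += (f(x + sigma * u) - fx) / sigma * u
    return g / m

# toy check on a quadratic, whose true gradient at x is x itself
f = lambda x: 0.5 * float(np.dot(x, x))
x = np.ones(3)
print(fd_gradient(f, x))   # ~ [1. 1. 1.]
print(gs_gradient(f, x))   # noisy estimate of the same gradient
```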

Cited by 14 publications (48 citation statements)
References 13 publications
“…This framework has led to variants of methods such as stochastic gradient descent [34], SVRG [51] and Adam [22], for example. We note that linear interpolation to orthogonal directions (more similar to traditional model-based DFO) has been shown to outperform gradient sampling as a gradient estimation technique [8,7].…”
Section: Existing Literature
confidence: 91%
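To make the contrast in the quoted statement concrete, here is a minimal sketch (not taken from the cited papers) of gradient estimation by linear interpolation on a set of orthonormal directions: with orthonormal directions the interpolation system solves in closed form. The function name and step size h are illustrative.

```python
import numpy as np

def interpolation_gradient(f, x, h=1e-6, rng=None):
    """Linear interpolation on a random orthonormal basis Q.

    The interpolation conditions g^T (h q_i) = f(x + h q_i) - f(x), i = 1..n,
    are solved exactly by g = Q d, where d_i is the forward difference along q_i.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = x.size
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # orthonormal directions
    fx = f(x)
    d = np.array([(f(x + h * Q[:, i]) - fx) / h for i in range(n)])
    return Q @ d
```

With n orthonormal directions this reproduces the finite-difference gradient in a rotated coordinate system, which is the sense in which it resembles traditional model-based DFO rather than gradient sampling over random, possibly fewer than n, directions.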
“…This gives a total cost of O(p³ + np). Alternative point removal mechanism: instead of Algorithm 4, we could have used a simpler mechanism for removing points, such as removing the points furthest from the current iterate (with total cost O(np)). However, this leads to a substantial performance penalty.…”
Section: Geometry Management
confidence: 99%
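For context on the O(np) figure in the quote, the simpler removal rule (discard the points furthest from the current iterate) can be sketched as below; Y, x and k are hypothetical names, and this is not the cited paper's Algorithm 4.

```python
import numpy as np

def drop_furthest(Y, x, k):
    """Drop the k interpolation points in Y (shape p x n) furthest from x.

    Computing all p Euclidean distances costs O(np), matching the cost
    quoted for this simpler mechanism.
    """
    dists = np.linalg.norm(Y - x, axis=1)          # O(np)
    keep = np.argsort(dists)[: Y.shape[0] - k]     # indices of the p - k closest points
    return Y[keep]
```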
“…That is far from what we declared in Section 1. To improve this, it is worth proposing a more detailed construction than (7).…”
Section: Ideas Behind the Results
confidence: 99%
“…In particular, we propose an alternative approach to regularization that is based on "early stopping" of the considered iterative procedure, by developing a proper stopping rule. Now we explain how to reduce the relative inexactness (3) to (7) and how to apply (9) when µ ε. Since f(x) has a Lipschitz gradient, from (3) and (7) we may derive that after k iterations (where k is greater than L/µ up to a logarithmic factor log(LR²/ε), with ε the accuracy in function value)…”
Section: L
confidence: 99%
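The iteration count in the quoted statement matches the standard complexity bound for gradient-type methods on a µ-strongly convex objective with L-Lipschitz gradient; a sketch of that bound, with constants omitted, is:

```latex
% Standard bound (constants omitted): starting at distance R from the minimizer,
% accuracy \varepsilon in function value is reached after
\[
  k \;=\; O\!\left(\frac{L}{\mu}\,\log\frac{L R^{2}}{\varepsilon}\right)
\]
% iterations, i.e. k exceeds L/\mu only by the logarithmic factor \log(L R^{2}/\varepsilon).
```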