2019
DOI: 10.48550/arxiv.1905.01332
Preprint

A Theoretical and Empirical Comparison of Gradient Approximations in Derivative-Free Optimization

Abstract: In this paper, we analyze several methods for approximating the gradient of a function using only function values. These methods include finite differences, linear interpolation, Gaussian smoothing and smoothing on a unit sphere. The methods differ in the number of functions sampled, the choice of the sample points and the way in which the gradient approximations are derived. For each method, we derive bounds on the number of samples and the sampling radius which guarantee favorable convergence properties for …
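As a rough illustration of two of the estimators compared in the paper, the sketch below implements a forward finite-difference estimate and a Monte-Carlo Gaussian-smoothing estimate. The step size h, sampling radius sigma and sample count m are placeholder values, not the bounds derived in the paper.

```python
import numpy as np

def fd_gradient(f, x, h=1e-6):
    """Forward finite-difference gradient estimate: n extra function evaluations."""
    fx = f(x)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = 1.0
        g[i] = (f(x + h * e) - fx) / h
    return g

def gs_gradient(f, x, sigma=1e-2, m=100, rng=None):
    """Monte-Carlo estimate of the gradient of the Gaussian-smoothed objective."""
    rng = np.random.default_rng() if rng is None else rng
    fx = f(x)
    g = np.zeros_like(x)
    for _ in range(m):
        u = rng.standard_normal(x.size)
        g += (f(x + sigma * u) - fx) / sigma * u
    return g / m

# toy check on a quadratic, whose true gradient at x is x itself
f = lambda x: 0.5 * float(np.dot(x, x))
x = np.ones(3)
print(fd_gradient(f, x))   # ~ [1. 1. 1.]
print(gs_gradient(f, x))   # noisy estimate of the same gradient
```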

Cited by 14 publications (48 citation statements)
References 13 publications
“…This framework has led to variants of methods such as stochastic gradient descent [34], SVRG [51] and Adam [22], for example. We note that linear interpolation to orthogonal directions (more similar to traditional model-based DFO) has been shown to outperform gradient sampling as a gradient estimation technique [8,7].…”
Section: Existing Literature
confidence: 91%
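To make the contrast in the quoted statement concrete, here is a minimal sketch (not taken from the cited papers) of gradient estimation by linear interpolation on a set of orthonormal directions: with orthonormal directions the interpolation system solves in closed form. The function name and step size h are illustrative.

```python
import numpy as np

def interpolation_gradient(f, x, h=1e-6, rng=None):
    """Linear interpolation on a random orthonormal basis Q.

    The interpolation conditions g^T (h q_i) = f(x + h q_i) - f(x), i = 1..n,
    are solved exactly by g = Q d, where d_i is the forward difference along q_i.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = x.size
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # orthonormal directions
    fx = f(x)
    d = np.array([(f(x + h * Q[:, i]) - fx) / h for i in range(n)])
    return Q @ d
```

With n orthonormal directions this reproduces the finite-difference gradient in a rotated coordinate system, which is the sense in which it resembles traditional model-based DFO rather than gradient sampling over random, possibly fewer than n, directions.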
“…This gives a total cost of O(p³ + np). Alternative point removal mechanism: instead of Algorithm 4, we could have used a simpler mechanism for removing points, such as removing the points furthest from the current iterate (with total cost O(np)). However, this leads to a substantial performance penalty.…”
Section: Geometry Management
confidence: 99%
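For context on the O(np) figure in the quote, the simpler removal rule (discard the points furthest from the current iterate) can be sketched as below; Y, x and k are hypothetical names, and this is not the cited paper's Algorithm 4.

```python
import numpy as np

def drop_furthest(Y, x, k):
    """Drop the k interpolation points in Y (shape p x n) furthest from x.

    Computing all p Euclidean distances costs O(np), matching the cost
    quoted for this simpler mechanism.
    """
    dists = np.linalg.norm(Y - x, axis=1)          # O(np)
    keep = np.argsort(dists)[: Y.shape[0] - k]     # indices of the p - k closest points
    return Y[keep]
```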
“…That is far from what we declared in Section 1. To improve this, it is worth proposing a more detailed construction than (7).…”
Section: Ideas Behind the Results
confidence: 99%
“…In particular, we propose an alternative approach to regularization that is based on "early stopping" of the considered iterative procedure, by developing a proper stopping rule. Now we explain how to reduce the relative inexactness (3) to (7) and how to apply (9) when µ ε. Since f(x) has a Lipschitz gradient, from (3) and (7) we may derive that after k iterations (where k is greater than L/µ up to a logarithmic factor log(LR²/ε), with ε the accuracy in function value)…”
Section: L
confidence: 99%
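The iteration count in the quoted statement matches the standard complexity bound for gradient-type methods on a µ-strongly convex objective with L-Lipschitz gradient; a sketch of that bound, with constants omitted, is:

```latex
% Standard bound (constants omitted): starting at distance R from the minimizer,
% accuracy \varepsilon in function value is reached after
\[
  k \;=\; O\!\left(\frac{L}{\mu}\,\log\frac{L R^{2}}{\varepsilon}\right)
\]
% iterations, i.e. k exceeds L/\mu only by the logarithmic factor \log(L R^{2}/\varepsilon).
```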