1974 IEEE Conference on Decision and Control Including the 13th Symposium on Adaptive Processes
DOI: 10.1109/cdc.1974.270399
On the Goldstein-Levitin-Polyak gradient projection method

Abstract: This paper considers some aspects of a gradient projection method proposed by Goldstein [1], Levitin and Polyak [3], and more recently, in a less general context, by McCormick [10]. We propose and analyze some convergent step-size rules to be used in conjunction with the method. These rules are similar in spirit to the efficient Armijo rule for the method of steepest descent and under mild assumptions they have the desirable property that they identify the set of active inequality constraints in a finite numbe…
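The abstract describes gradient projection with an Armijo-like step-size rule. A minimal sketch of that scheme for a box constraint, where the trial point is projected and the step is backtracked until a sufficient-decrease test holds (function names, constants, and the exact form of the test are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def project_box(x, lo, hi):
    """Euclidean projection onto the box [lo, hi] (componentwise clipping)."""
    return np.clip(x, lo, hi)

def gradient_projection(f, grad, x0, lo, hi,
                        s=1.0, beta=0.5, sigma=1e-4,
                        max_iter=200, tol=1e-8):
    """Gradient projection with an Armijo-style backtracking rule along
    the projection arc; a sketch of the kind of method the paper analyzes."""
    x = project_box(np.asarray(x0, dtype=float), lo, hi)
    for _ in range(max_iter):
        g = grad(x)
        alpha = s
        while alpha > 1e-16:
            x_new = project_box(x - alpha * g, lo, hi)
            # Sufficient decrease relative to the projected step x - x_new.
            if f(x) - f(x_new) >= sigma * g.dot(x - x_new):
                break
            alpha *= beta
        if np.linalg.norm(x_new - x) <= tol:
            return x_new
        x = x_new
    return x
```

For example, minimizing the distance to a point outside the box returns the projection of that point onto the box, and every iterate stays feasible.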

Cited by 80 publications (121 citation statements) | References: 0 publications
“…Let d_k be the minimizer of (3). (This minimizer exists and is unique by the strict convexity of the subproblem (3), but we will see later that we do not need to compute it.)…”
Section: Algorithm 2.1: Inexact Variable Metric Methods
confidence: 99%
“…The SPG method is related to the practical version of Bertsekas [3] of the classical gradient projection method of Goldstein, Levitin and Polyak [21,25]. However, some critical differences make this method much more efficient than its gradient-projection predecessors.…”
Section: Introduction
confidence: 99%
“…The last term on the right-hand side, (5.5), measures the error in the Hessian approximation, along the direction d, due to the use of a smaller sample H_k. It is apparent from (5.4) that it is inefficient to require that the residual r_k be significantly smaller than the Hessian approximation error ∆H_k(w_k; d), as the extra effort in solving the linear system may not lead to an improved search direction for the objective function J_{S_k}(w).…”
Section: The Conjugate Gradient Iteration
confidence: 99%
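The excerpt above argues that the conjugate gradient residual need not be driven far below the Hessian-approximation error. A hedged sketch of that truncation idea — stop the CG solve once the residual norm drops below an error estimate — where the function name and the form of the stopping rule are assumptions for illustration:

```python
import numpy as np

def truncated_cg(A, b, err_est, max_iter=100):
    """Conjugate gradient for A x = b (A symmetric positive definite),
    stopped once ||r|| <= err_est, since extra accuracy beyond the
    Hessian-approximation error would be wasted work."""
    x = np.zeros_like(b)
    r = b - A @ x            # initial residual
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter):
        if np.sqrt(rs) <= err_est:
            break
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x
```

With a tiny error estimate this solves the system; with a large one it returns early (here immediately, with the zero vector), which is exactly the intended trade-off.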
“…However, there is a significant difference with other CRF and HCRF models that use such techniques to find optimal parameters: we are constrained to only positive θ-parameters. Since we are using a quasi-Newton method with Armijo backtracking line search, we can use the gradient projection method of [12,13] to enforce this constraint. Finally, it is important to stress here that, although our model includes parameters that are not treated probabilistically, we have not seen signs of overfitting in our experiments (see Fig.
Section: Model Training
confidence: 99%