In a real Hilbert space H, we study the fast convergence properties as t → +∞ of the trajectories of the second-order evolution equation

ẍ(t) + (α/t) ẋ(t) + ∇Φ(x(t)) = 0,    (1)

where ∇Φ is the gradient of a convex continuously differentiable function Φ : H → R, and α is a positive parameter. In this inertial system, the viscous damping coefficient α/t vanishes asymptotically in a moderate way. For α > 3, we show that any trajectory converges weakly to a minimizer of Φ, assuming only that argmin Φ ≠ ∅. Strong convergence is established in various practical situations. These results complement the O(t⁻²) rate of convergence for the values obtained by Su, Boyd and Candès. Time discretization of this system, and of some of its variants, provides new fast converging algorithms, expanding the field of rapid methods for structured convex minimization introduced by Nesterov and further developed by Beck and Teboulle. This study also complements recent advances due to Chambolle and Dossal.

Relationship with fast numerical optimization methods: as pointed out in [38, Section 2], for α = 3, (1) can be seen as a continuous version of the fast convergent method of Nesterov (see [29, 30, 31, 32]) and its widely used successors, such as the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA), studied in [19]. These methods have a convergence rate of Φ(x_k) − min Φ = O(k⁻²), where k is the number of iterations. As for the continuous-time system (1), convergence of the sequences generated by FISTA and related methods had not been established so far; this is a central and long-standing question in the study of numerical optimization methods. The purpose of this research is to establish the convergence of the trajectories satisfying (1), as well as of the sequences generated by the corresponding numerical methods with Nesterov-type acceleration. We also complete the study with several stability properties concerning both the continuous-time system and the algorithms.
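A natural finite-difference discretization of (1) yields a Nesterov-type accelerated gradient iteration whose extrapolation coefficient (k − 1)/(k + α − 1) mirrors the vanishing damping α/t. The sketch below (function and parameter names are our own, not from the paper) illustrates the idea on a smooth convex objective:

```python
def accelerated_gradient(grad, x0, step, alpha=3.0, iters=100):
    """Nesterov-type accelerated gradient method, read as a
    discretization of x'' + (alpha/t) x' + grad Phi(x) = 0.
    `step` should be at most 1/L for an L-Lipschitz gradient."""
    x_prev = list(x0)
    x = list(x0)
    for k in range(1, iters + 1):
        # extrapolation coefficient (k-1)/(k+alpha-1) plays the role
        # of the vanishing viscous damping alpha/t
        beta = (k - 1) / (k + alpha - 1)
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        g = grad(y)
        x_prev = x
        x = [yi - step * gi for yi, gi in zip(y, g)]
    return x
```

For instance, with Φ(x) = ‖x‖²/2 (so grad is the identity) and step 0.5, the iterates decay to the unique minimizer 0.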
This paper shows that error bounds can be used as effective tools for deriving complexity results for first-order descent methods in convex minimization. In a first stage, this objective led us to revisit the interplay between error bounds and the Kurdyka-Lojasiewicz (KL) inequality. One can show the equivalence between the two concepts for convex functions having a moderately flat profile near the set of minimizers (such as functions with Hölderian growth). A counterexample shows that the equivalence is no longer true for extremely flat functions. This fact reveals the relevance of an approach based on the KL inequality. In a second stage, we show how KL inequalities can in turn be employed to compute new complexity bounds for a wealth of descent methods for convex problems. Our approach is completely original and makes use of a one-dimensional worst-case proximal sequence in the spirit of the famous majorant method of Kantorovich. Our result applies to a very simple abstract scheme that covers a wide class of descent methods. As a byproduct of our study, we also provide new results for the globalization of KL inequalities in the convex framework. Our main results inaugurate a simple methodology: derive an error bound, compute the desingularizing function whenever possible, identify essential constants in the descent method, and finally compute the complexity using the one-dimensional worst-case proximal sequence. Our method is illustrated through projection methods for feasibility problems, and through the famous iterative shrinkage-thresholding algorithm (ISTA), for which we show that the complexity bound is of the form O(qᵏ), where the constituents of the bound only depend on error bound constants obtained for an arbitrary least squares objective with ℓ¹ regularization.
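For orientation, ISTA on the least squares objective with ℓ¹ regularization, 0.5‖Ax − b‖² + λ‖x‖₁, alternates a gradient step with soft-thresholding. A minimal dense-matrix sketch (all names and the toy problem are illustrative, not from the paper):

```python
import math

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1, applied componentwise."""
    return [math.copysign(max(abs(c) - tau, 0.0), c) for c in v]

def ista(A, b, lam, step, x0, iters=500):
    """ISTA for min 0.5*||Ax - b||^2 + lam*||x||_1 on small dense data.
    `step` should be at most 1/||A^T A||."""
    x = list(x0)
    for _ in range(iters):
        # forward (gradient) step on the smooth part
        r = [sum(Ai[j] * x[j] for j in range(len(x))) - bi
             for Ai, bi in zip(A, b)]
        g = [sum(A[i][j] * r[i] for i in range(len(b)))
             for j in range(len(x))]
        # backward (proximal) step on the l1 part
        x = soft_threshold([xi - step * gi for xi, gi in zip(x, g)],
                           step * lam)
    return x
```

With A the identity, the iteration reaches the closed-form solution soft_threshold(b, λ) immediately, which is a convenient sanity check.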
Abstract. The forward-backward algorithm is a powerful tool for solving optimization problems with an additively separable smooth + nonsmooth structure. In the convex setting, a simple but ingenious acceleration scheme developed by Nesterov has proved useful for improving the theoretical rate of convergence of the function values from the standard O(k⁻¹) down to O(k⁻²). In this short paper, we prove that the rate of convergence of a slight variant of Nesterov's accelerated forward-backward method, which produces convergent sequences, is actually o(k⁻²), rather than O(k⁻²). Our arguments rely on the connection between this algorithm and a second-order differential inclusion with vanishing damping. Final version published at SIOPT.
We study the convergence of general descent methods applied, in a Hilbert space, to a lower semicontinuous nonconvex function satisfying the Kurdyka-Łojasiewicz inequality. We prove that any precompact sequence converges to a critical point of the function, and obtain new convergence rates both for the values and the iterates. The analysis covers alternating versions of the forward-backward method with variable metric and relative errors. As an example, a nonsmooth and nonconvex version of the Levenberg-Marquardt algorithm is detailed.
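As a point of reference for the last example, the classical smooth Levenberg-Marquardt step for min 0.5·F(x)² damps a Gauss-Newton step by a parameter μ. The paper treats a nonsmooth, nonconvex generalization; the sketch below is only the standard one-dimensional smooth case, with illustrative names:

```python
def levenberg_marquardt_1d(F, dF, x0, mu=1e-3, iters=50):
    """Classical 1-D Levenberg-Marquardt iteration for min 0.5*F(x)^2.
    mu > 0 damps the Gauss-Newton step (J^T J + mu I)^{-1} J^T F,
    which in one dimension reduces to a scalar division."""
    x = x0
    for _ in range(iters):
        J = dF(x)
        x -= J * F(x) / (J * J + mu)  # damped Gauss-Newton step
    return x
```

For F(x) = x² − 2 started at x₀ = 1.5, the iterates converge to √2, a zero of F.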
In a Hilbert space setting, for convex optimization, we analyze the convergence rate of a class of first-order algorithms involving inertial features. They can be interpreted as discrete-time versions of inertial dynamics involving both viscous and Hessian-driven dampings. The geometrical damping driven by the Hessian intervenes in the dynamics in the form ∇²f(x(t)) ẋ(t). By treating this term as the time derivative of ∇f(x(t)), this gives, in discretized form, algorithms that are first-order in time and space. In addition to the convergence properties attached to Nesterov-type accelerated gradient methods, the algorithms thus obtained are new and show a rapid convergence towards zero of the gradients. On the basis of a regularization technique using the Moreau envelope, we extend these methods to nonsmooth convex functions with extended real values. The introduction of time scale factors makes it possible to further accelerate these algorithms. We also report numerical results on structured problems to support our theoretical findings.
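Since d/dt ∇f(x(t)) = ∇²f(x(t)) ẋ(t), the Hessian-driven term can be discretized by the gradient difference ∇f(x_k) − ∇f(x_{k−1}), giving a first-order algorithm. A hedged sketch of such an inertial gradient scheme (coefficients and names are illustrative, not the paper's exact scheme):

```python
import math

def inertial_hessian_damped(grad, x0, step, alpha=3.0, beta=0.5,
                            iters=300):
    """Inertial gradient method with Hessian-driven damping: the
    correction beta*sqrt(s)*(grad f(x_k) - grad f(x_{k-1}))
    discretizes the geometric damping Hess f(x) x'(t)."""
    s = step
    x_prev = list(x0)
    x = list(x0)
    g_prev = grad(x0)
    for k in range(1, iters + 1):
        mom = (k - 1) / (k + alpha - 1)  # viscous (Nesterov) damping
        g = grad(x)
        y = [xi + mom * (xi - xpi) - beta * math.sqrt(s) * (gi - gpi)
             for xi, xpi, gi, gpi in zip(x, x_prev, g, g_prev)]
        x_prev, g_prev = x, g
        x = [yi - s * gyi for yi, gyi in zip(y, grad(y))]
    return x
```

The gradient-difference correction tends to attenuate the oscillations typical of purely inertial methods, which is one way to read the reported rapid decay of the gradients.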
We study the behavior of the trajectories of a second-order differential equation with vanishing damping, governed by the Yosida regularization of a maximally monotone operator with time-varying index, along with a new Regularized Inertial Proximal Algorithm obtained by means of a convenient finite-difference discretization. These systems are the counterpart to accelerated forward-backward algorithms in the context of maximally monotone operators. A proper tuning of the parameters allows us to prove the weak convergence of the trajectories to zeroes of the operator. Moreover, it is possible to estimate the rate at which the speed and acceleration vanish. We also study the effect of perturbations or computational errors that leave the convergence properties unchanged. We also analyze a growth condition under which strong convergence can be guaranteed. A simple example shows the criticality of the assumptions on the Yosida approximation parameter, and allows us to illustrate the behavior of these systems compared with some of their close relatives.
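The Yosida regularization A_λ = (I − (I + λA)⁻¹)/λ is single-valued and Lipschitz, so it can be evaluated from the resolvent alone, and an inertial iteration can then apply explicit steps of A_λ. The sketch below is illustrative (fixed λ and step; the paper's point is precisely the tuning of a time-varying index):

```python
def yosida(resolvent, lam):
    """Yosida approximation A_lam = (I - (I + lam*A)^{-1}) / lam,
    built from the resolvent v -> (I + lam*A)^{-1} v."""
    def A_lam(v):
        Jv = resolvent(v, lam)
        return [(vi - ji) / lam for vi, ji in zip(v, Jv)]
    return A_lam

def regularized_inertial_proximal(resolvent, x0, step, alpha=3.0,
                                  lam=1.0, iters=400):
    """Sketch of a regularized inertial proximal iteration driven by
    the Yosida approximation of a maximally monotone operator.
    Parameter schedules here are constants, not the paper's tuning."""
    A_lam = yosida(resolvent, lam)
    x_prev = list(x0)
    x = list(x0)
    for k in range(1, iters + 1):
        mom = (k - 1) / (k + alpha - 1)
        y = [xi + mom * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        x_prev = x
        x = [yi - step * ai for yi, ai in zip(y, A_lam(y))]
    return x
```

For A = I (resolvent v ↦ v/(1 + λ)), the unique zero of A is 0 and the iterates converge to it.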
Abstract. We are concerned with the study of a class of forward-backward penalty schemes for solving variational inequalities 0 ∈ Ax + N_C(x), where H is a real Hilbert space, A : H ⇉ H is a maximal monotone operator, and N_C is the outward normal cone to a closed convex set C ⊂ H. Let Ψ : H → R be a convex differentiable function whose gradient is Lipschitz continuous, and which acts as a penalization function with respect to the constraint x ∈ C. Given a sequence (βₙ) of penalization parameters which tends to infinity, and a sequence of positive time steps (λₙ) ∈ ℓ² \ ℓ¹, we consider the diagonal forward-backward algorithm

xₙ₊₁ = (I + λₙA)⁻¹(xₙ − λₙβₙ∇Ψ(xₙ)).

Assuming that (βₙ) satisfies the growth condition lim supₙ→∞ λₙβₙ < 2/θ (where θ is the Lipschitz constant of ∇Ψ), we obtain weak ergodic convergence of the sequence (xₙ) to an equilibrium for a general maximal monotone operator A. We also obtain weak convergence of the whole sequence (xₙ) when A is the subdifferential of a proper lower-semicontinuous convex function. As a key ingredient of our analysis, we use the cocoerciveness of the operator ∇Ψ. When specializing our results to coupled systems, we bring new light on Passty's Theorem, and obtain convergence results for new parallel splitting algorithms for variational inequalities involving coupling in the constraint. We also establish robustness and stability results that account for numerical approximation errors. An illustration to compressive sensing is given.
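The diagonal scheme is directly implementable once the resolvent of A and ∇Ψ are available: each iteration takes a forward step on the penalization term and a backward (resolvent) step on A, with both schedules varying along the iterations. A minimal sketch, with placeholder schedules chosen only to satisfy the stated conditions (λₙ ∈ ℓ² \ ℓ¹ and λₙβₙ bounded below 2/θ):

```python
def diagonal_forward_backward(resolvent_A, grad_Psi, x0, lam, beta,
                              iters=200):
    """Diagonal forward-backward penalty scheme
        x_{n+1} = (I + lam_n A)^{-1}(x_n - lam_n beta_n grad Psi(x_n)),
    with step and penalty schedules given as callables n -> value."""
    x = list(x0)
    for n in range(iters):
        ln, bn = lam(n), beta(n)
        # forward step on the penalization, backward step on A
        v = [xi - ln * bn * gi for xi, gi in zip(x, grad_Psi(x))]
        x = resolvent_A(v, ln)
    return x
```

For example, with A(x) = x − 1, C = {0} penalized by Ψ(x) = ‖x‖²/2 (so θ = 1), and λₙ = 1/(n+2), βₙ = n+2 (hence λₙβₙ = 1 < 2), the iterates approach the constrained equilibrium 0.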