Many popular first-order algorithms for convex optimization, such as forward-backward splitting, Douglas-Rachford splitting, and the alternating direction method of multipliers (ADMM), can be formulated as averaged iteration of a nonexpansive mapping. In this paper we propose a line search for averaged iteration that preserves the theoretical convergence guarantee, while often accelerating practical convergence. We discuss several general cases in which the additional computational cost of the line search is modest compared to the savings obtained.

Introduction

First-order algorithms such as forward-backward splitting, Douglas-Rachford splitting, and the alternating direction method of multipliers (ADMM) are often used for large-scale convex optimization. While the theory tells us that these methods converge, practical convergence can be very slow for some problem instances. One effective way to reduce the number of iterations is to precondition the problem data. This approach has been extensively studied in the literature and has proven very successful in practice; see, e.g., [4,7,22,16,18,19] for a limited selection of such approaches.

Another general approach to improving practical efficiency is to carry out a line search, i.e., to first compute a tentative next iterate and then to select the next iterate on the ray from the current iterate passing through the tentative iterate. Typical line searches are based on some readily computed quantity such as the function value or the norm of the gradient or residual. A well-designed line search preserves the theoretical convergence of the base method while accelerating practical convergence. Line search is widely used in gradient descent and Newton methods; see [6,24]. These line search methods cannot be applied to all of the first-order methods mentioned above, however, since in general there is no readily computed quantity that is decreasing. (The convergence proofs for these methods typically rely on quantities related to the distance to an optimal point, which cannot be evaluated while the algorithm is running.) In this paper we propose a general line search scheme that is applicable to most first-order convex optimization methods, including those mentioned above whose convergence proofs are not based on the decrease of an observable quantity.

We exploit the fact that many first-order optimization algorithms can be viewed as averaged iterations of some nonexpansive operator, i.e., they can be written in the form

x^{k+1} = (1 − ᾱ)x^k + ᾱSx^k = x^k + ᾱ(Sx^k − x^k),   (1)

where ᾱ ∈ (0, 1) and S : R^n → R^n is nonexpansive, i.e., it satisfies ‖Su − Sv‖₂ ≤ ‖u − v‖₂ for all u, v. The superscript k denotes the iteration number. The middle expression shows that the next point is a weighted average of the current point x^k and Sx^k. The expression on the right-hand side of (1) shows that the iteration can be interpreted as taking a step of length ᾱ in the direction of the fixed-point residual r^k = Sx^k − x^k. Assuming a fixed point exists, the iteration (1) converges to the set of fixed points. In this paper we will show how steps sometimes much longer than ᾱ can be taken in this direction while preserving convergence.
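To make the idea concrete, here is a minimal numerical sketch of iteration (1) with an optional long step along the fixed-point residual. The function name, the parameter `alpha_max`, and the simple residual-norm acceptance test are all illustrative assumptions; the paper's actual safeguarded line search condition differs.

```python
import numpy as np

def averaged_iteration_ls(S, x0, alpha_bar=0.5, alpha_max=8.0,
                          max_iter=1000, tol=1e-10):
    """Averaged iteration x+ = x + alpha*(Sx - x) with a simple line
    search: try a long step along the fixed-point residual and keep it
    only if the residual norm does not increase; otherwise fall back
    to the nominal step length alpha_bar."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        r = S(x) - x                       # fixed-point residual r^k
        rnorm = np.linalg.norm(r)
        if rnorm <= tol:
            break
        alpha = alpha_max
        while alpha > alpha_bar:           # backtrack toward alpha_bar
            x_trial = x + alpha * r
            if np.linalg.norm(S(x_trial) - x_trial) <= rnorm:
                break                      # long step accepted
            alpha *= 0.5
        else:
            alpha = alpha_bar              # nominal (safe) step
        x = x + alpha * r
    return x
```

Falling back to the nominal step ᾱ whenever the trial steps are rejected means the sketch never behaves worse than plain averaged iteration; for example, S could be the composition of two projections onto closed convex sets, which is nonexpansive.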
Generalized alternating projections is an algorithm that alternates relaxed projections onto a finite number of sets to find a point in their intersection. We consider the special case of two linear subspaces, for which the algorithm reduces to a matrix iteration. For convergent matrix iterations, the asymptotic rate is linear and determined by the magnitude of the subdominant eigenvalue. In this paper, we show how to select the three algorithm parameters to optimize this magnitude, and hence the asymptotic convergence rate. The obtained rate depends on the Friedrichs angle between the subspaces and is considerably better than known rates for other methods such as alternating projections and Douglas-Rachford splitting. We also present an adaptive scheme that, online, estimates the Friedrichs angle and updates the algorithm parameters based on this estimate. A numerical example is provided that supports our theoretical claims and shows very good performance for the adaptive method.

Definition 1 The principal angles θ_k ∈ [0, π/2], k = 1, …, p, between two subspaces U, V ⊆ R^n, where p = min(dim U, dim V), are recursively defined by

cos θ_k := max_{u_k ∈ U, v_k ∈ V} ⟨u_k, v_k⟩  subject to  ‖u_k‖ = ‖v_k‖ = 1, ⟨u_k, v_i⟩ = ⟨u_i, v_k⟩ = 0 for all i = 1, …, k − 1.
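Principal angles can be computed numerically: with orthonormal bases for U and V, the cosines of the principal angles are the singular values of Q_U^T Q_V. A small sketch follows; the function name and the QR-based orthonormalization are illustrative, and the inputs are assumed to have full column rank.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles theta_1 <= ... <= theta_p between the column
    spaces of A and B, where p = min(dim U, dim V). The cosines are
    the singular values of Qa.T @ Qb for orthonormal bases Qa, Qb."""
    Qa, _ = np.linalg.qr(A)        # orthonormal basis for span(A)
    Qb, _ = np.linalg.qr(B)        # orthonormal basis for span(B)
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)   # guard against roundoff
    return np.arccos(cosines)
```

The Friedrichs angle that governs the rates above is the smallest principal angle that is strictly positive, i.e., θ_{s+1} where s = dim(U ∩ V).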
This paper is about line search for the generalized alternating projections (GAP) method. This method is a generalization of the von Neumann alternating projections method, in which relaxed projections are alternated instead of standard projections. The method can be interpreted as an averaged iteration of a nonexpansive mapping. Therefore, a recently proposed line search method for such algorithms is applicable to GAP. We evaluate this line search and show situations in which it can be performed with little additional cost. We also present a variation of the basic line search for GAP: the projected line search. We prove its convergence and show that the line search condition is convex in the step length parameter. We show that almost all convex optimization problems can be solved using this approach, and numerical results show superior performance with both the standard and the projected line search, sometimes by several orders of magnitude, compared to the nominal method.
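For two subspaces the GAP operator is linear, so one iteration is a matrix-vector product. Below is a minimal sketch of the nominal method (no line search); the parameter names alpha, a1, a2 and the composition order of the relaxed projections are assumptions based on the three-parameter form described above.

```python
import numpy as np

def gap_operator(Qu, Qv, alpha, a1, a2):
    """Build the GAP matrix T = (1 - alpha) I + alpha * S2 @ S1, where
    S_i = (1 - a_i) I + a_i P_i are relaxed projections onto the
    subspaces spanned by the orthonormal columns of Qu and Qv."""
    n = Qu.shape[0]
    I = np.eye(n)
    S1 = (1 - a1) * I + a1 * (Qu @ Qu.T)   # relaxed projection onto U
    S2 = (1 - a2) * I + a2 * (Qv @ Qv.T)   # relaxed projection onto V
    return (1 - alpha) * I + alpha * (S2 @ S1)

def gap_iterate(T, x0, n_iter=200):
    """Nominal GAP: repeated application of the linear operator T."""
    x = x0
    for _ in range(n_iter):
        x = T @ x
    return x
```

Since T is an averaged nonexpansive operator for suitable parameter choices, a line search of the kind sketched earlier applies with S taken to be the GAP operator.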
Forward-backward and Douglas-Rachford splitting are methods for structured nonsmooth optimization. With the aim of using smooth optimization techniques for nonsmooth problems, the forward-backward and Douglas-Rachford envelopes were recently proposed. Under specific problem assumptions, these envelope functions have favorable smoothness and convexity properties, and their stationary points coincide with the fixed points of the underlying algorithm operators. This allows such nonsmooth optimization problems to be solved by minimizing the corresponding smooth convex envelope function. In this paper, we present a general envelope function that unifies and generalizes existing ones. We provide properties of the general envelope function that sharpen corresponding known results for the special cases. We also present a new interpretation of the underlying methods as majorization-minimization algorithms applied to their respective envelope functions.
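As the simplest concrete instance of this correspondence (a standard fact about the Moreau envelope, not the general construction of this paper): the Moreau envelope g^γ of a convex function g is smooth, and its stationary points coincide with the fixed points of the proximal operator, i.e., with the minimizers of g:

```latex
g^{\gamma}(x) = \min_{z}\left\{ g(z) + \tfrac{1}{2\gamma}\|z - x\|^{2} \right\},
\qquad
\nabla g^{\gamma}(x) = \tfrac{1}{\gamma}\left(x - \operatorname{prox}_{\gamma g}(x)\right),
```

so ∇g^γ(x) = 0 exactly when x = prox_{γg}(x). The forward-backward and Douglas-Rachford envelopes play the analogous role for their respective algorithm operators.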