Optimization methods are at the core of many problems in signal/image processing, computer vision, and machine learning. It has long been recognized that looking at the dual of an optimization problem may drastically simplify its solution. Deriving efficient strategies which jointly bring into play the primal and the dual problems is, however, a more recent idea which has generated many important contributions in recent years. These novel developments are grounded in recent advances in convex analysis, discrete optimization, parallel processing, and nonsmooth optimization, with an emphasis on sparsity issues. In this paper, we aim at presenting the principles of primal-dual approaches, while giving an overview of numerical methods which have been proposed in different contexts. We show the benefits which can be drawn from primal-dual algorithms both for solving large-scale convex optimization problems and for solving discrete ones, and we provide various application examples to illustrate their usefulness.
Index Terms: Convex optimization, discrete optimization, duality, linear programming, proximal methods, inverse problems, computer vision, machine learning, big data.

Optimization [1] is an extremely popular paradigm which constitutes the backbone of many branches of applied mathematics and engineering, such as signal processing, computer vision, machine learning, inverse problems, and network communications, to mention just a few. The popularity of optimization approaches often stems from the fact that many problems in the above fields are typically characterized by a lack of closed-form solutions and by uncertainties. In signal and image processing, for instance, uncertainties can be introduced by noise, sensor imperfections, or ambiguities that are often inherent in visual interpretation. As a result, perfect or exact solutions hardly exist, whereas inexact but optimal (in a statistical or an application-specific sense) solutions, and their efficient computation, are what one aims at.

At the same time, one important characteristic that is nowadays shared by increasingly many optimization problems encountered in the above areas is the fact that these problems are often of very large scale. A good example is the field of computer vision, where one often needs to solve low-level problems that require associating at least one (and typically more than one) variable with each pixel of an image (or, even worse, of an image sequence, as in the case of video) [2]. This leads to problems that can easily contain millions of variables, which are therefore the norm rather than the exception in this context.
Similarly, in fields like machine learning [3], [4], due to the great ease with which data can now be collected and stored, one quite often has to cope with truly massive datasets and to train very large models, which naturally leads to optimization problems of very high dimensionality [5]. Of course, a similar situation arises in many other scientific domains.
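To fix ideas before the formal development, the following is a minimal sketch of the primal-dual pairing that underlies the approaches surveyed in this paper; the notation here (functions $f$ and $g$, linear operator $L$) is generic and chosen purely for illustration. For proper, lower-semicontinuous convex functions $f\colon \mathbb{R}^N \to \left]-\infty,+\infty\right]$ and $g\colon \mathbb{R}^M \to \left]-\infty,+\infty\right]$, and a linear operator $L\colon \mathbb{R}^N \to \mathbb{R}^M$, the primal problem and its Fenchel-Rockafellar dual read
\begin{align*}
\text{(Primal)}\qquad & \min_{x \in \mathbb{R}^N} \; f(x) + g(Lx),\\
\text{(Dual)}\qquad & \min_{v \in \mathbb{R}^M} \; f^*(-L^\top v) + g^*(v),
\end{align*}
where $f^*(u) = \sup_{x} \big( \langle u, x \rangle - f(x) \big)$ denotes the Fenchel conjugate of $f$. Under standard qualification conditions, strong duality holds and the two optimal values sum to zero; primal-dual algorithms exploit this structure by updating the primal variable $x$ and the dual variable $v$ jointly, rather than solving either problem in isolation.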