Despite its important applications in Machine Learning, min-max optimization of objective functions that are nonconvex-nonconcave remains elusive. Not only are there no known first-order methods converging to even approximate local min-max equilibria (a.k.a. approximate saddle points), but the computational complexity of identifying them is also poorly understood. In this paper, we provide a characterization of the computational complexity as well as of the limitations of first-order methods in this problem. Specifically, we show that in linearly constrained min-max optimization problems with nonconvex-nonconcave objectives an approximate local min-max equilibrium of large enough approximation is guaranteed to exist, but computing such a point is PPAD-complete. The same is true of computing an approximate fixed point of the (Projected) Gradient Descent/Ascent update dynamics, which is computationally equivalent to computing approximate local min-max equilibria. An important byproduct of our proof is to establish an unconditional hardness result in the Nemirovsky-Yudin [36] oracle optimization model, where we are given oracle access to the values of some function f : P → [−1, 1] and its gradient ∇f, where P ⊆ [0, 1]^d is a known convex polytope. We show that any algorithm that uses such first-order oracle access to f and finds an ε-approximate local min-max equilibrium needs to make a number of oracle queries that is exponential in at least one of 1/ε, L, G, or d, where L and G are respectively the smoothness and Lipschitzness of f. This comes in sharp contrast to minimization problems, where finding approximate local minima in the same setting can be done with Projected Gradient Descent using O(L/ε) many queries. Our result is the first to show an exponential separation between these two fundamental optimization problems in the oracle model.
CCS CONCEPTS: • Theory of computation → Problems, reductions and completeness; Complexity classes; • Mathematics of computing → Nonconvex optimization.
Despite its important applications in Machine Learning, min-max optimization of objective functions that are nonconvex-nonconcave remains elusive. Not only are there no known first-order methods converging even to approximate local min-max points, but the computational complexity of identifying them is also poorly understood. In this paper, we provide a characterization of the computational complexity of the problem, as well as of the limitations of first-order methods in constrained min-max optimization problems with nonconvex-nonconcave objectives and linear constraints.
As a warm-up, we show that, even when the objective is a Lipschitz and smooth differentiable function, deciding whether a min-max point exists, in fact even deciding whether an approximate min-max point exists, is NP-hard. More importantly, we show that an approximate local min-max point of large enough approximation is guaranteed to exist, but finding one such point is PPAD-complete. The same is true of computing an approximate fixed point of the (Projected) Gradient Descent/Ascent update dynamics.
An important byproduct of our proof is to establish an unconditional hardness result in the Nemirovsky-Yudin [NY83] oracle optimization model. We show that, given oracle access to some function f : P → [−1, 1] and its gradient ∇f, where P ⊆ [0, 1]^d is a known convex polytope, every algorithm that finds an ε-approximate local min-max point needs to make a number of queries that is exponential in at least one of 1/ε, L, G, or d, where L and G are respectively the smoothness and Lipschitzness of f and d is the dimension. This comes in sharp contrast to minimization problems, where finding approximate local minima in the same setting can be done with Projected Gradient Descent using O(L/ε) many queries. Our result is the first to show an exponential separation between these two fundamental optimization problems in the oracle model.
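The notion of an approximate fixed point of the Projected Gradient Descent/Ascent dynamics can be made concrete with a small sketch. It rests on several assumptions that are not in the abstract: the feasible set is simplified to the box [0, 1]^d for each player (the paper allows general polytopes), the gap is measured as the Euclidean displacement of one update, and the names `project_box`, `gda_step` and `fixed_point_gap` are hypothetical.

```python
import math

def project_box(v, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^d (assumed feasible set)."""
    return [min(max(c, lo), hi) for c in v]

def gda_step(grad_x, grad_y, x, y, eta):
    """One projected step: descend in x (min player), ascend in y (max player)."""
    gx, gy = grad_x(x, y), grad_y(x, y)
    x_new = project_box([xi - eta * g for xi, g in zip(x, gx)])
    y_new = project_box([yi + eta * g for yi, g in zip(y, gy)])
    return x_new, y_new

def fixed_point_gap(grad_x, grad_y, x, y, eta):
    """Distance moved by one step; a small value means (x, y) is an
    approximate fixed point of the update map."""
    x_new, y_new = gda_step(grad_x, grad_y, x, y, eta)
    moved = [a - b for a, b in zip(x_new + y_new, x + y)]
    return math.sqrt(sum(m * m for m in moved))

# Toy objective f(x, y) = (x - 0.5)^2 - (y - 0.5)^2, chosen only for illustration.
quad_gx = lambda x, y: [2.0 * (x[0] - 0.5)]
quad_gy = lambda x, y: [-2.0 * (y[0] - 0.5)]

# Linear objective f(x, y) = x - y: the fixed point sits on the boundary.
lin_gx = lambda x, y: [1.0]
lin_gy = lambda x, y: [-1.0]

gap_interior = fixed_point_gap(quad_gx, quad_gy, [0.5], [0.5], eta=0.1)
gap_corner = fixed_point_gap(quad_gx, quad_gy, [0.0], [0.0], eta=0.1)
gap_boundary = fixed_point_gap(lin_gx, lin_gy, [0.0], [0.0], eta=0.1)
```

In the second toy objective the gradients are nonzero at (0, 0), yet the projection keeps the point in place, which is why the fixed-point condition rather than a vanishing gradient is the natural stationarity notion under constraints. The hardness results above say that no first-order method can drive this gap below ε efficiently in general; the sketch only evaluates the gap at given points.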
We study the multistage K-facility reallocation problem on the real line, where we maintain K facility locations over T stages, based on the stage-dependent locations of n agents. Each agent is connected to the nearest facility at each stage, and the facilities may move from one stage to another, to accommodate different agent locations. The objective is to minimize the connection cost of the agents plus the total moving cost of the facilities, over all stages. K-facility reallocation was introduced by de Keijzer and Wojtczak [10], where they mostly focused on the special case of a single facility. Using an LP-based approach, we present a polynomial time algorithm that computes the optimal solution for any number of facilities. We also consider online K-facility reallocation, where the algorithm becomes aware of agent locations in a stage-by-stage fashion. By exploiting an interesting connection to the classical K-server problem, we present a constant-competitive algorithm for K = 2 facilities.
[…] consecutive timesteps. The stability of the solutions is modeled by introducing an additional moving cost (or switching cost), which has a different definition depending on the particular setting.
Model and Motivation. In this work, we study the multistage K-facility reallocation problem on the real line, introduced by de Keijzer and Wojtczak [10]. In K-facility reallocation, K facilities are initially located at (x_1^0, …, x_K^0) on the real line. Facilities are meant to serve n agents for the next T days. At each day, each agent connects to the facility closest to its location and incurs a connection cost equal to this distance. The locations of the agents may change every day, thus we have to move facilities accordingly in order to reduce the connection cost. Naturally, moving a facility is not for free, but comes with the price of the distance that the facility was moved.
Our goal is to specify the exact positions of the facilities at each day so that the total connection cost plus the total moving cost is minimized over all T days. In the online version of the problem, the positions of the agents at each stage t are revealed only after determining the locations of the facilities at stage t − 1.
For a motivating example, consider a company willing to advertise its products. To this end, it organizes K advertising campaigns at different locations of a large city for the next T days. Based on planned events, weather forecasts, etc., the company estimates a population distribution over the locations of the city for each day. Then, the company decides to compute the best possible campaign reallocation with K campaigns over all days (see also [10] for more examples).
de Keijzer and Wojtczak [10] fully characterized the optimal offline and online algorithms for the special case of a single facility and presented a dynamic programming algorithm for K ≥ 1 facilities with running time exponential in K. Despite the practical significance and the interesting theoretical properties of K-facility reallocation, its computationa...
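The objective described above (connection cost plus moving cost, summed over all days) can be evaluated for a given facility schedule with a short sketch. This is only an illustration of the cost model, not the LP-based algorithm from the paper; the function name `total_cost`, the argument layout, and the convention that facility i on one day is matched with facility i on the next are assumptions.

```python
def total_cost(initial, schedule, agents):
    """Cost of a facility schedule on the real line.

    initial  -- the K starting facility positions
    schedule -- schedule[t] holds the K facility positions chosen for day t
    agents   -- agents[t] holds the agent positions on day t
    """
    cost = 0.0
    previous = initial
    for facilities, day_agents in zip(schedule, agents):
        # Moving cost: distance each facility travels since the previous day.
        cost += sum(abs(new - old) for new, old in zip(facilities, previous))
        # Connection cost: each agent pays the distance to its nearest facility.
        cost += sum(min(abs(a - f) for f in facilities) for a in day_agents)
        previous = facilities
    return cost

# Toy instance: K = 1 facility starting at 0, T = 2 days.
agents = [[1.0, 3.0], [2.0]]
cost_move = total_cost([0.0], [[2.0], [2.0]], agents)  # move to 2 on day 1, then stay
cost_stay = total_cost([0.0], [[0.0], [0.0]], agents)  # never move
```

In this toy instance moving the facility costs 2 up front but lowers the connection cost enough to beat staying put, which is the trade-off the problem asks to optimize.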