Investigating Bi-Level Optimization for Learning and Vision from a Unified Perspective: A Survey and Beyond

Liu, Risheng; Gao, Jiaxin; Zhang, Jin; Meng, Deyu; Lin, Zhouchen

doi:10.48550/arxiv.2101.11517

Cited by 15 publications

(22 citation statements)

References 133 publications

“…However, as a value-function, $\phi(x)$ is non-smooth, non-convex, and even with jumps, thus ill-conditioned, so we use a smooth function to approximate $\phi(x)$ and obtain $\frac{\partial \phi(x)}{\partial x}$. Existing methods can be classified into two categories according to divergent ways to calculate $\frac{\partial \phi(x)}{\partial x}$ [20], [35], [36], i.e., Explicit Gradient-Based Methods (EGBMs), which derives the gradient by Automatic Differentiation (AD), and Implicit Gradient-Based Methods (IGBMs), which apply the implicit function theorem to deal with the optimality conditions of LL problems.…”

Section: Related Work (mentioning)

confidence: 99%
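
As a hedged illustration of the two families named in the statement above, the sketch below compares an explicit (unrolled) gradient with an implicit one on a toy scalar problem. The objectives, the constant `a`, and all function names are assumptions made for this example and do not come from the cited papers. The derivative through the unrolled loop is propagated by hand in forward mode, standing in for what an AD library would do automatically.

```python
# Toy problem (assumed for illustration only):
#   lower level: f(x, y) = 0.5 * a * y**2 - x * y   (strongly convex in y for a > 0)
#   upper level: F(x, y) = 0.5 * (y - 1)**2
# The lower-level solution is y*(x) = x / a, so d/dx F(x, y*(x)) = (x / a - 1) / a.


def explicit_hypergradient(x, steps=200, lr=0.1, a=2.0):
    """Unroll gradient descent on the lower level and differentiate through it."""
    y = 0.0       # lower-level iterate
    dy_dx = 0.0   # derivative of the iterate with respect to x (forward mode)
    for _ in range(steps):
        # One gradient step on f in y, and the derivative of that step w.r.t. x.
        y, dy_dx = y - lr * (a * y - x), dy_dx - lr * (a * dy_dx - 1.0)
    # Chain rule: F has no direct dependence on x in this toy example.
    return (y - 1.0) * dy_dx


def implicit_hypergradient(x, a=2.0):
    """Apply the implicit function theorem to the condition df/dy = 0."""
    y_star = x / a            # solves a * y - x = 0
    d2f_dyy = a               # second derivative of f in y
    d2f_dyx = -1.0            # mixed derivative of f in (y, x)
    dy_dx = -d2f_dyx / d2f_dyy
    return (y_star - 1.0) * dy_dx


if __name__ == "__main__":
    print(explicit_hypergradient(3.0), implicit_hypergradient(3.0))
```

At `x = 3.0` the implicit value is exactly 0.25, and the unrolled estimate approaches it as `steps` grows. The unrolled version needs no closed-form solution but carries a derivative through every step, whereas the implicit version needs an accurate lower-level solution and a second-derivative solve.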

“…Currently, a number of important machine learning and deep learning tasks can be captured by hierarchical models, such as hyper-parameter optimization [1], [2], [3], [4], neural architecture search [5], [6], [7], meta learning [8], [9], [10], Generative Adversarial Networks (GAN) [11], [12], reinforcement learning [13], image processing [14], [15], [16], [17], and so on. In general, these hierarchical models can be formulated as the following Bi-Level Optimization (BLO) problem [18], [19], [20]: $\text{``}\min_{x \in \mathcal{X}}\text{''} \; F(x, y), \quad \text{s.t.} \quad y \in \mathcal{S}(x) := \arg\min_{y} f(x, y), \quad (1)$ where $x \in \mathcal{X}$ is the Upper-Level (UL) variable, $y \in \mathbb{R}^n$ is the Lower-Level (LL) variable, the UL objective $F(x, y) : \mathcal{X} \times \mathbb{R}^n \to \mathbb{R}$ and the LL objective $f(x, y) : \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}$ are continuously differentiable and jointly continuous functions, and the UL constraint $\mathcal{X} \subset \mathbb{R}^m$ is a compact set.…”

Section: Introduction (mentioning)

confidence: 99%
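
To make the nested structure of problem (1) concrete, here is a minimal sketch on a toy instance. The specific objectives, the interval [0, 2] used as the compact UL set, and the function names are all assumptions chosen for this example. The lower level is solved by plain gradient descent, and the upper level by a grid search over the compact set.

```python
# Toy instance of problem (1), assumed for illustration only:
#   UL objective: F(x, y) = (x - 1)**2 + (y - 0.5)**2, with x in [0, 2]
#   LL objective: f(x, y) = (y - x)**2, so S(x) = {x} is a singleton


def solve_lower(x, steps=100, lr=0.25):
    """Approximate y in S(x) by gradient descent on f(x, .)."""
    y = 0.0
    for _ in range(steps):
        y -= lr * 2.0 * (y - x)  # gradient of (y - x)**2 with respect to y
    return y


def upper_objective(x, y):
    return (x - 1.0) ** 2 + (y - 0.5) ** 2


def solve_bilevel(grid_size=201, lo=0.0, hi=2.0):
    """Grid search over the UL variable, solving the LL problem at each point."""
    best = (None, None, float("inf"))
    for i in range(grid_size):
        x = lo + (hi - lo) * i / (grid_size - 1)
        y = solve_lower(x)             # LL response to this UL choice
        value = upper_objective(x, y)  # UL objective evaluated at the response
        if value < best[2]:
            best = (x, y, value)
    return best


if __name__ == "__main__":
    print(solve_bilevel())
```

Substituting y = x gives the single-level objective (x - 1)² + (x - 0.5)², which is minimized at x = 0.75 with value 0.125; the grid search recovers this point. Grid search is only workable here because the UL variable is one-dimensional, which is why gradient-based alternatives matter in higher dimensions.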

Value-Function-based Sequential Minimization for Bi-level Optimization

Liu, Li, Zeng et al. 2021

Preprint

Self Cite

Gradient-based Bi-Level Optimization (BLO) methods have been widely applied to handle modern learning tasks. However, most existing strategies are theoretically designed based on restrictive assumptions (e.g., convexity of the lower-level sub-problem), and are computationally not applicable to high-dimensional tasks. Moreover, there are almost no gradient-based methods able to solve BLO in challenging scenarios such as BLO with functional constraints and pessimistic BLO. In this work, by reformulating BLO into approximated single-level problems, we provide a new algorithm, named Bi-level Value-Function-based Sequential Minimization (BVFSM), to address the above issues. Specifically, BVFSM constructs a series of value-function-based approximations, and thus avoids the repeated calculations of recurrent gradients and Hessian inverses required by existing approaches, which are time-consuming, especially for high-dimensional tasks. We also extend BVFSM to address BLO with additional functional constraints. More importantly, BVFSM can be used for the challenging pessimistic BLO, which has never been properly solved before. In theory, we prove the convergence of BVFSM on these types of BLO, in which the restrictive lower-level convexity assumption is completely discarded. To the best of our knowledge, this is the first gradient-based algorithm that can solve different kinds of BLO (e.g., optimistic, pessimistic, and with constraints) with solid convergence guarantees. Extensive experiments verify the theoretical investigations and demonstrate our superiority on various real-world applications.

“…In the past decade, researchers have discovered numerous applications of bi-level programming in machine learning, including meta-learning (ML) [9], adversarial learning [16], hyperparameter optimization [23] and neural architecture search [19]. These newly found bilevel programs in ML are often solved by descent methods, which require differentiating through the (usually unconstrained) lower-level optimization problem [20]. The differentiation can be carried out either implicitly on the optimality conditions as in the conventional sensitivity analysis [see e.g., 1, 32, 3], or explicitly by unrolling the numerical procedure used to solve the lower-level problem [see e.g., 23, 10].…”

Section: Related Work (mentioning)

confidence: 99%

Inducing Equilibria via Incentives: Simultaneous Design-and-Play Finds Global Optima

Boyi, Li, Yang et al. 2021

Preprint

To regulate a social system comprised of self-interested agents, economic incentives (e.g., taxes, tolls, and subsidies) are often required to induce a desirable outcome. This incentive design problem naturally possesses a bi-level structure, in which an upper-level "designer" modifies the payoffs of the agents with incentives while anticipating the response of the agents at the lower level, who play a non-cooperative game that converges to an equilibrium. The existing bi-level optimization algorithms developed in machine learning raise a dilemma when applied to this problem: anticipating how incentives affect the agents at equilibrium requires solving the equilibrium problem repeatedly, which is computationally inefficient; bypassing the time-consuming step of equilibrium-finding can reduce the computational cost, but may lead the designer to a sub-optimal solution. To address such a dilemma, we propose a method that tackles the designer's and agents' problems simultaneously in a single loop. In particular, at each iteration, both the designer and the agents only move one step based on the first-order information. In the proposed scheme, although the designer does not solve the equilibrium problem repeatedly, it can anticipate the overall influence of the incentives on the agents, which guarantees optimality. We prove that the algorithm converges to the global optima at a sublinear rate for a broad class of games.
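
The single-loop idea in this abstract can be pictured with a generic toy scheme in which both levels take one step per iteration. This is not the authors' algorithm: the scalar objectives, the step sizes, and the running sensitivity estimate `s` are assumptions introduced purely for illustration.

```python
# Toy setting (assumed for illustration only):
#   lower level ("agent"):    f(x, y) = 0.5 * a * y**2 - x * y
#   upper level ("designer"): F(x, y) = 0.5 * (y - 1)**2 + 0.5 * reg * x**2
# With a = 2 and reg = 0.1 the bilevel solution is x = 10/7, y = 5/7.


def single_loop(iters=5000, alpha=0.05, beta=0.1, a=2.0, reg=0.1):
    """Update designer, agent, and a sensitivity estimate once per iteration."""
    x, y, s = 0.0, 0.0, 0.0  # s is a running estimate of dy/dx
    for _ in range(iters):
        # Agent: one gradient step on f in y (the lower level is never solved fully).
        y_next = y - beta * (a * y - x)
        # Sensitivity: the same step differentiated with respect to x.
        s_next = s - beta * (a * s - 1.0)
        # Designer: one step along the estimated total derivative of F.
        x_next = x - alpha * (reg * x + (y - 1.0) * s)
        x, y, s = x_next, y_next, s_next
    return x, y, s


if __name__ == "__main__":
    print(single_loop())
```

In this toy case the iterates settle at the same point a nested solver would reach, without an inner loop. Whether such a scheme converges in general depends on the step sizes and on the structure of the problem, which is the kind of question the abstract's convergence result addresses for its own method.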

“…First introduced in the field of economic game theory by Stackelberg (1934), this problem has recently received increasing attention in the machine learning community (Domke, 2012; Gould et al., 2016; Liao et al., 2018; Blondel et al., 2021; Liu et al., 2021; Shaban et al., 2019). Indeed, many machine learning applications can be reduced to (1) including hyper-parameter optimization (Feurer and Hutter, 2019), meta-learning (Bertinetto et al., 2018), reinforcement learning (Hong et al., 2020b; Liu et al., 2021) or dictionary learning (Mairal et al., 2011; Lecouat et al., 2020a;b).…”

Section: Introduction (mentioning)

confidence: 99%

Amortized Implicit Differentiation for Stochastic Bilevel Optimization

Arbel, Mairal 2021

Preprint

We study a class of algorithms for solving bilevel optimization problems in both stochastic and deterministic settings when the inner-level objective is strongly convex. Specifically, we consider algorithms based on inexact implicit differentiation and we exploit a warm-start strategy to amortize the estimation of the exact gradient. We then introduce a unified theoretical framework inspired by the study of singularly perturbed systems (Habets, 1974) to analyze such amortized algorithms. By using this framework, our analysis shows these algorithms to match the computational complexity of oracle methods that have access to an unbiased estimate of the gradient, thus outperforming many existing results for bilevel optimization. We illustrate these findings on synthetic experiments and demonstrate the efficiency of these algorithms on hyper-parameter optimization experiments involving several thousands of variables.
