2019
DOI: 10.48550/arxiv.1910.12430
Preprint

Differentiable Convex Optimization Layers

Akshay Agrawal, Brandon Amos, Shane Barratt, et al.

Abstract: Recent work has shown how to embed differentiable optimization problems (that is, problems whose solutions can be backpropagated through) as layers within deep learning architectures. This method provides a useful inductive bias for certain problems, but existing software for differentiable optimization layers is rigid and difficult to apply to new settings. In this paper, we propose an approach to differentiating through disciplined convex programs, a subclass of convex optimization problems used by domain-specific languages (DSLs) for convex optimization. …
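The approach described in the abstract is implemented in the authors' cvxpylayers library, which turns a disciplined parametrized CVXPY problem into a differentiable PyTorch layer. Below is a minimal sketch of how such a layer might be constructed and backpropagated through; the problem, sizes, and data are illustrative, and it assumes the `cvxpy`, `cvxpylayers`, and `torch` packages are installed.

```python
# Minimal sketch: a small parametrized convex problem wrapped as a
# differentiable PyTorch layer with cvxpylayers.
import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

n, m = 2, 3

# Declare the problem symbolically: A and b are parameters, x is the variable.
x = cp.Variable(n)
A = cp.Parameter((m, n))
b = cp.Parameter(m)
objective = cp.Minimize(cp.pnorm(A @ x - b, p=1))
constraints = [x >= 0]
problem = cp.Problem(objective, constraints)
assert problem.is_dpp()  # disciplined parametrized program check

# Wrap the problem as a layer mapping (A, b) -> solution x*.
layer = CvxpyLayer(problem, parameters=[A, b], variables=[x])

# Forward pass solves the problem; backward pass differentiates the solution
# with respect to the parameter values.
A_t = torch.randn(m, n, requires_grad=True)
b_t = torch.randn(m, requires_grad=True)
(x_star,) = layer(A_t, b_t)
x_star.sum().backward()  # gradients flow back into A_t and b_t
```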

Cited by 25 publications (46 citation statements)
References 47 publications (65 reference statements)
“…A typical benefit of implicit models is that the iterates x_i do not need to be stored during the forward pass of the network, because gradients can be calculated using the implicit function theorem: it bypasses the memory storage issue of GPUs (Wang et al., 2018; Peng et al., 2017; Zhu et al., 2017) during automatic differentiation. Another application is to consider neural architectures that include an argmin layer, for which the output is also formulated as the solution of a nested optimization problem (Agrawal et al., 2019; Gould et al., 2016, 2019).…”
Section: Attention and Gradient Flows
confidence: 99%
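The excerpt above describes differentiating an argmin layer with the implicit function theorem instead of backpropagating through solver iterates. A minimal sketch of that idea for a hypothetical quadratic argmin layer in PyTorch follows; the class `ArgminLayer` and the fixed matrix `A` are illustrative and not taken from the cited works.

```python
# Sketch: an argmin layer x*(theta) = argmin_x 0.5 x^T A x - theta^T x,
# differentiated via the implicit function theorem. The forward solver's
# iterates need not be stored; backward only uses the optimality condition
# A x* - theta = 0.
import torch

A = torch.tensor([[3.0, 1.0], [1.0, 2.0]])  # fixed positive-definite matrix


class ArgminLayer(torch.autograd.Function):
    @staticmethod
    def forward(ctx, theta):
        # Any iterative solver could sit here; we solve the toy problem directly.
        return torch.linalg.solve(A, theta)

    @staticmethod
    def backward(ctx, grad_out):
        # Implicit function theorem: with F(x, theta) = A x - theta = 0,
        # dx*/dtheta = A^{-1}, so the vector-Jacobian product is A^{-T} grad_out.
        return torch.linalg.solve(A.T, grad_out)


theta = torch.tensor([1.0, -2.0], requires_grad=True)
x_star = ArgminLayer.apply(theta)
x_star.sum().backward()
print(theta.grad)  # equals A^{-T} @ ones
```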
“…Also related are works that seek to enforce constraints on learning problems [115]. While several heuristic algorithms exist for this setting, many focus on restricted classes of constraints [116]-[120], and those that can handle more general constraints come at the cost of added computational complexity [121, 122]. Moreover, each of these works seeks to enforce constraints on a particular parameterization of the learning problem (such as directly on the weights of a neural network) rather than on the underlying statistical problem, as we do in this paper.…”
Section: Further Related Work
confidence: 99%
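The constraint-enforcement heuristics referenced above typically act directly on a network's parameterization. A minimal sketch of one such restricted-constraint heuristic, projecting a weight matrix back onto a norm ball after each gradient step, is shown below; it is entirely illustrative and not taken from references [115]-[122].

```python
# Sketch: keep a weight matrix inside a Frobenius-norm ball by projecting
# after each gradient step (a simple projection heuristic on the weights).
import torch

torch.manual_seed(0)
W = torch.randn(4, 4, requires_grad=True)
X, y = torch.randn(32, 4), torch.randn(32, 4)
radius, lr = 1.0, 0.1

for _ in range(100):
    loss = ((X @ W - y) ** 2).mean()
    loss.backward()
    with torch.no_grad():
        W -= lr * W.grad
        # Projection step: rescale W onto the constraint set if it left it.
        norm = W.norm()
        if norm > radius:
            W *= radius / norm
    W.grad = None

print(W.norm())  # <= radius
```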
“…Challenges in Batch Optimization: Recently, there has been strong interest in solving several instances of a convex quadratic program (QP) in parallel [15], [16]. The core innovation in [15] lies in rewriting the underlying matrix algebra so that matrices that do not change with the batch index can be isolated and their factorization pre-stored.…”
Section: Connections To Sampling-Based Trajectory Optimization
confidence: 99%
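The batching trick described above, isolating matrices that do not change with the batch index and pre-storing their factorization, can be sketched for equality-constrained QPs as follows. This is an illustrative NumPy/SciPy sketch, not the implementation of [15] or [16].

```python
# Sketch: for a batch of equality-constrained QPs
#   minimize 0.5 x^T Q x + q_i^T x  subject to  A x = b_i,
# the KKT matrix [[Q, A^T], [A, 0]] does not depend on the batch index i,
# so it is factorized once and reused for every right-hand side.
import numpy as np
from scipy.linalg import lu_factor, lu_solve

n, m, batch = 10, 3, 64
rng = np.random.default_rng(0)

Q = np.eye(n)                         # shared, batch-invariant
A = rng.standard_normal((m, n))       # shared, batch-invariant
q = rng.standard_normal((batch, n))   # varies with the batch index
b = rng.standard_normal((batch, m))   # varies with the batch index

# Factorize the shared KKT matrix once.
kkt = np.block([[Q, A.T], [A, np.zeros((m, m))]])
factor = lu_factor(kkt)

# Solve all batch instances by reusing the stored factorization.
rhs = np.concatenate([-q, b], axis=1).T   # shape (n + m, batch)
sols = lu_solve(factor, rhs)              # shape (n + m, batch)
x = sols[:n].T                            # primal solutions, shape (batch, n)
```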