Abstract: This paper presents an efficient method for extracting the second-order sensitivities from a system of implicit nonlinear equations on upcoming computer systems dominated by graphics processing units (GPUs). We design a custom automatic differentiation (AutoDiff) backend that targets highly parallel architectures by extracting the second-order information in batch. When the nonlinear equations are associated with a reduced-space optimization problem, we leverage the parallel reverse-mode accumulation in a batched a…
“…Hence, we avoid computing the full sensitivity matrix S and rely instead on a batched variant of the adjoint-adjoint algorithm [28]. First, we compute an LU factorization of G_x, as P G_x Q = LU, with P and Q two permutation matrices and L and U respectively a lower and an upper triangular matrix (using SpRF, the factorization can be updated entirely on the GPU if the sparsity pattern of G_x stays the same across iterations).…”
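The key point of the excerpt is that one sparse LU factorization of G_x can be reused across a whole batch of adjoint solves. A minimal CPU-side sketch of that algebra, using SciPy's SuperLU as a stand-in for the paper's GPU refactorization (SpRF) — all sizes and data below are illustrative, not from the paper:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

# Hypothetical small stand-in for the sparse Jacobian G_x; the paper keeps
# the factorization on the GPU with SpRF, here we use SuperLU on the CPU
# purely to illustrate the batched adjoint-adjoint step.
n, nbatch = 5, 3
rng = np.random.default_rng(0)
G_x = sp.csc_matrix(np.diag(rng.uniform(1.0, 2.0, n))
                    + 0.1 * rng.standard_normal((n, n)))

# One factorization P G_x Q = L U, with permutations chosen internally
# for sparsity and stability.
lu = splu(G_x)

# Batched adjoint solves G_x^T lambda_j = b_j: each column shares the same
# L and U factors, so only cheap triangular sweeps remain per right-hand side.
B = rng.standard_normal((n, nbatch))
Lambda = lu.solve(B, trans="T")

# Residual check: the factorization solved every system in the batch.
assert np.allclose(G_x.T @ Lambda, B)
```

On a GPU the same structure applies: the triangular solves for the batch of right-hand sides are independent and can run in parallel against the shared factors.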
Section: Porting the Reduction Algorithm to the GPU
The interior-point method (IPM) has become the workhorse method for nonlinear programming. The performance of IPM is directly related to the linear solver employed to factorize the Karush-Kuhn-Tucker (KKT) system at each iteration of the algorithm. When solving large-scale nonlinear problems, state-of-the-art IPM solvers rely on efficient sparse linear solvers to solve the KKT system. Instead, we propose a novel reduced-space IPM algorithm that condenses the KKT system into a dense matrix whose size is proportional to the number of degrees of freedom in the problem. Depending on where the reduction occurs, we derive two variants of the reduced-space method: linearize-then-reduce and reduce-then-linearize. We adapt their workflow so that the vast majority of computations are accelerated on GPUs. We provide extensive numerical results on the optimal power flow problem, comparing our GPU-accelerated reduced-space IPM with Knitro and a hybrid full-space IPM algorithm. By evaluating the derivatives on the GPU and solving the KKT system on the CPU, the hybrid solution is already significantly faster than the CPU-only solutions. The two reduced-space algorithms go one step further by solving the KKT system entirely on the GPU. As expected, the performance of the two reduction algorithms depends intrinsically on the number of available degrees of freedom: their performance is poor when the problem has many degrees of freedom, but the two algorithms are up to 3 times faster than Knitro as soon as the relative number of degrees of freedom becomes smaller.
Mihai Anitescu dedicates this work to the 70th birthday of Florian Potra. Florian, thank you for the great contributions to optimization in general, and interior point methods in particular, and for initiating me and many others in them.
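The abstract's central idea — condensing the KKT system onto the degrees of freedom — can be illustrated with a toy dense Schur-complement reduction. All names below (K for the KKT block, G_x and G_u for the state and control Jacobians, Z for the null-space basis) are illustrative assumptions, not the paper's exact formulation of either variant:

```python
import numpy as np

# Toy condensation of a full-space KKT matrix onto the controls u.
# Variables split into states x (eliminated through G(x, u) = 0) and
# controls u; the reduced matrix is Z^T K Z with Z a basis of the
# null space of the constraint Jacobian [G_x  G_u].
rng = np.random.default_rng(1)
nx, nu = 6, 2                                  # many states, few degrees of freedom
K = rng.standard_normal((nx + nu, nx + nu))
K = K + K.T + 2 * (nx + nu) * np.eye(nx + nu)  # symmetric positive definite toy Hessian
G_x = np.eye(nx) + 0.1 * rng.standard_normal((nx, nx))
G_u = rng.standard_normal((nx, nu))

# Null-space basis: [G_x  G_u] @ Z = 0 by construction.
Z = np.vstack([-np.linalg.solve(G_x, G_u), np.eye(nu)])

# Condensed KKT matrix: dense, and only nu-by-nu regardless of nx.
K_red = Z.T @ K @ Z
assert K_red.shape == (nu, nu)
```

The reduced matrix is dense but tiny when nu is small relative to nx, which is exactly the regime where the abstract reports the reduced-space algorithms outperforming Knitro.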