This paper treats neural networks as dynamical systems governed by finite difference equations. It shows that introducing k-many skip connections into network architectures, such as residual networks and additive dense networks, defines kth order dynamical equations on the layer-wise transformations. Closed-form solutions are found for the state space representations of general kth order additive dense networks, in which the concatenation operation is replaced by addition, as well as for kth order smooth networks. This formulation endows deep neural networks with an algebraic structure. Furthermore, it is shown that imposing kth order smoothness on network architectures with d-many nodes per layer increases the state space dimension by a multiple of k, so the effective dimension into which the neural network embeds the data manifold is k·d. It follows that network architectures of these types reduce the number of parameters needed to maintain the same embedding dimension by a factor of k^2 when compared to an equivalent first-order residual network. Numerical simulations and experiments on CIFAR-10, SVHN, and MNIST were conducted to illustrate the developed theory and the efficacy of the proposed concepts.

universal approximator [9], where it is learning something similar to a piecewise linear finite-mesh approximation of the data manifold. Recent work consistent with the original intuition of learning perturbations from the identity has shown that residual networks, with their first-order perturbation term, can be formulated as a finite difference approximation of a first-order differential equation [5]. This has the interesting consequence that residual networks are C^1 smooth dynamical equations through the layers of the network. Additionally, one may then define entire classes of C^k differentiable transformations over the layers and induce network architectures from their finite difference approximations. Work by Chang et al. [3] likewise considered residual neural networks as forward difference approximations to C^1 transformations. This work has been extended to develop new network architectures by using central differencing, as opposed to forward differencing, to approximate the set of coupled first-order differential equations, yielding the Midpoint Network [2]. Similarly, other researchers have used different numerical schemes to approximate the first-order ordinary differential equations, such as the linear multistep method used to develop the Linear Multistep architecture [10]. This differs from the previous work [5], in which entire classes of finite difference approximations to kth order differential equations are defined. Haber and Ruthotto [4] considered how stability techniques from finite difference methods can be applied to improve first- and second-order smooth neural networks. For example, they suggest requiring that the real parts of the eigenvalues of the Jacobians of the layer transformations be approximately zero. This ensures that little information about...
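As a minimal worked example of this finite-difference view (the notation x_l for the layer-l activations and F for the learned residual map is illustrative, not taken from the cited works), a residual block

\[
x_{l+1} = x_l + F(x_l)
\]

is the forward (Euler) difference approximation, with unit step size, of the first-order equation dx/dt = F(x). One natural second-order (k = 2) analogue replaces the first difference with a second difference,

\[
x_{l+1} - 2x_l + x_{l-1} = F(x_l),
\]

which approximates d^2x/dt^2 = F(x) and requires skip connections from both x_l and x_{l-1}. Writing the state as s_l = (x_l, x_{l-1}) in R^{2d} converts this into a first-order system whose state space has dimension 2d, which illustrates the k·d embedding-dimension claim for the case k = 2.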
Generative Adversarial Networks (GANs) have been shown to be powerful and flexible priors when solving inverse problems. One challenge of using them is overcoming representation error, the fundamental limitation of the network in representing any particular signal. Recently, multiple proposed inversion algorithms reduce representation error by optimizing over intermediate layer representations. These methods are typically applied to generative models that were trained without regard to the downstream inversion algorithm. In our work, we introduce the principle that if a generative model is intended for inversion using an algorithm based on optimization of intermediate layers, it should be trained in a way that regularizes those intermediate layers. We instantiate this principle for two notable recent inversion algorithms: Intermediate Layer Optimization and the Multi-Code GAN prior. For both of these inversion algorithms, we introduce a new regularized GAN training algorithm and demonstrate that the learned generative model results in lower reconstruction errors across a wide range of undersampling ratios when solving compressed sensing, inpainting, and super-resolution problems.
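A minimal sketch of this principle, assuming a toy two-stage generator, an L2 penalty on the intermediate representation as a stand-in for the paper's actual regularizer, and a simple compressed-sensing inversion loop (none of these names, shapes, or hyperparameters come from the paper):

import torch
import torch.nn as nn

# Toy generator split into two stages so the intermediate representation
# z_mid = g1(z) is exposed for later inversion (illustrative architecture).
class Generator(nn.Module):
    def __init__(self, latent_dim=64, mid_dim=128, out_dim=784):
        super().__init__()
        self.g1 = nn.Sequential(nn.Linear(latent_dim, mid_dim), nn.ReLU())
        self.g2 = nn.Sequential(nn.Linear(mid_dim, mid_dim), nn.ReLU(),
                                nn.Linear(mid_dim, out_dim))

    def forward(self, z):
        z_mid = self.g1(z)
        return self.g2(z_mid), z_mid

# Training-time loss with an intermediate-layer regularizer (an assumed
# L2 penalty, standing in for the paper's regularization term).
def generator_loss(disc_scores_fake, z_mid, reg_weight=1e-3):
    adv_loss = -disc_scores_fake.mean()          # adversarial term
    reg_loss = reg_weight * z_mid.pow(2).mean()  # regularize intermediate layer
    return adv_loss + reg_loss

# Inversion by optimizing over the intermediate representation:
# given measurements y = A x*, fit z_mid so that A g2(z_mid) matches y.
def invert(generator, A, y, steps=500, lr=1e-2, mid_dim=128):
    z_mid = torch.zeros(1, mid_dim, requires_grad=True)
    opt = torch.optim.Adam([z_mid], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x_hat = generator.g2(z_mid).squeeze(0)   # candidate signal
        loss = (A @ x_hat - y).pow(2).mean()     # measurement fit
        loss.backward()
        opt.step()
    return generator.g2(z_mid).detach()

The intent of the sketch is only to show where the two pieces interact: the regularizer acts on the same intermediate representation that the inversion loop later optimizes, so that optimization stays in a region the generator was trained to behave well on.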