Wave physics as an analog recurrent neural network

Hughes, Tyler W.; Williamson, Ian A. D.; Minkov, Momchil; Fan, Shanhui

doi:10.1126/sciadv.aay6946

Cited by 273 publications

(180 citation statements)

References 40 publications

(37 reference statements)

Supporting

Mentioning

162

Contrasting


“…We also note that, recently, a mapping between the vanilla RNN's update equations and the wave equation has been proposed in [36] which demonstrates an equivalence between consecutive updates of the hidden state of an RNN and the dynamics of wave propagation. However, while intriguing, this equivalence is limited to a vanilla RNN and we believe that our guiding observations allow for a much broader field of interpretation and experimentation as will be demonstrated by affirming results.…”

Section: Research Methodology a Motivation (mentioning)

confidence: 84%

Physics-Informed Deep Neural Networks for Transient Electromagnetic Analysis

Noakoasteen, Wang, Peng, et al. 2020

IEEE Open J. Antennas Propag.


In this paper, we propose a deep neural network based model to predict the time evolution of field values in transient electrodynamics. The key component of our model is a recurrent neural network, which learns representations of long-term spatial-temporal dependencies in the sequence of its input data. We develop an encoder-recurrent-decoder architecture, which is trained with finite difference time domain simulations of plane wave scattering from distributed, perfect electric conducting objects. We demonstrate that the trained network can emulate a transient electrodynamics problem with more than 17 times speed-up in simulation time compared to traditional finite difference time domain solvers.
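The mapping mentioned in the citation statement above, between consecutive RNN hidden-state updates and wave propagation, can be illustrated with a minimal sketch. It assumes a 1D scalar wave equation discretized with a standard leapfrog scheme and omits the damping and nonlinearity a full model would include. The function names (`laplacian`, `wave_step`, `run_wave_rnn`), the grid size, and the source and probe positions are hypothetical choices for illustration, not taken from the cited papers.

```python
def laplacian(u, dx):
    """Second-order finite-difference Laplacian on a 1D grid (ends left at zero)."""
    n = len(u)
    lap = [0.0] * n
    for i in range(1, n - 1):
        lap[i] = (u[i - 1] - 2.0 * u[i] + u[i + 1]) / dx**2
    return lap


def wave_step(hidden, x_t, c, dt=0.5, dx=1.0, src=1):
    """One recurrent update: hidden = (u_t, u_prev) -> (u_next, u_t).

    The pair of field snapshots plays the role of the RNN hidden state,
    and the wave speed profile c plays the role of the trainable weights.
    """
    u_now, u_prev = hidden
    lap = laplacian(u_now, dx)
    u_next = [
        2.0 * u_now[i] - u_prev[i] + (c[i] * dt) ** 2 * lap[i]
        for i in range(len(u_now))
    ]
    u_next[src] += dt**2 * x_t    # inject the input sample at the source cell
    u_next[0] = u_next[-1] = 0.0  # keep the boundary cells clamped
    return (u_next, u_now)


def run_wave_rnn(inputs, c, probe=-2):
    """Feed a scalar input sequence through the wave 'RNN' and record intensity."""
    n = len(c)
    hidden = ([0.0] * n, [0.0] * n)
    outputs = []
    for x_t in inputs:
        hidden = wave_step(hidden, x_t, c)
        outputs.append(hidden[0][probe] ** 2)  # intensity read-out at the probe cell
    return outputs


if __name__ == "__main__":
    speed = [1.0] * 16                 # uniform medium; c * dt / dx = 0.5 is stable
    pulse = [1.0] + [0.0] * 39         # a single input sample, then silence
    print(run_wave_rnn(pulse, speed)[-5:])
```

In this picture the loop in `run_wave_rnn` is the unrolled recurrence, so a gradient of a loss on the outputs with respect to `c` could in principle be obtained by differentiating through the time steps, as one would for an ordinary RNN.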



“…At its core, the presented framework can be interpreted as a training regularization method that avoids overfitting of a machine learning hardware to the specific 3D physical structure, distances and operational conditions, which are often assumed to be deterministic, precise and ideal during the training phase. In this respect, beyond its application to practically improve diffractive optical neural networks, the core principles introduced in our work can be extended to train other machine learning platforms [35,50,51] to mitigate various physical error sources that can cause deviations between the designed inference models and their corresponding physical implementations.…”

Section: Discussion (mentioning)

confidence: 99%

Misalignment resilient diffractive optical networks

et al. 2020


As an optical machine learning framework, Diffractive Deep Neural Networks (D2NN) take advantage of data-driven training methods used in deep learning to devise light–matter interaction in 3D for performing a desired statistical inference task. Multi-layer optical object recognition platforms designed with this diffractive framework have been shown to generalize to unseen image data achieving, e.g., >98% blind inference accuracy for hand-written digit classification. The multi-layer structure of diffractive networks offers significant advantages in terms of their diffraction efficiency, inference capability and optical signal contrast. However, the use of multiple diffractive layers also brings practical challenges for the fabrication and alignment of these diffractive systems for accurate optical inference. Here, we introduce and experimentally demonstrate a new training scheme that significantly increases the robustness of diffractive networks against 3D misalignments and fabrication tolerances in the physical implementation of a trained diffractive network. By modeling the undesired layer-to-layer misalignments in 3D as continuous random variables in the optical forward model, diffractive networks are trained to maintain their inference accuracy over a large range of misalignments; we term this diffractive network design as vaccinated D2NN (v-D2NN). We further extend this vaccination strategy to the training of diffractive networks that use differential detectors at the output plane as well as to jointly-trained hybrid (optical-electronic) networks to reveal that all of these diffractive designs improve their resilience to misalignments by taking into account possible 3D fabrication variations and displacements during their training phase.


“…Therefore the complexity of NML in both theoretical and computational aspects will increase when the latent space dimension increases. Hughes et al (2019) and Sun et al (2020) showed that the wave-equation modeling is equivalent to the recurrent neural network (RNN) and the FWI gradient can be automatically calculated by the AD. Because CAE training also relies on the AD, therefore the AD is a perfect tool to numerically connect a CAE architecture to the wave-equation inversion.…”

Section: Hybrid Machine Learning Inversion (mentioning)

confidence: 99%

Seismic inversion by Newtonian machine learning

Chen, Schuster 2020

GEOPHYSICS


We present a wave-equation inversion method that inverts skeletonized seismic data for the subsurface velocity model. The skeletonized representation of the seismic traces consists of the low-rank latent-space variables predicted by a well-trained autoencoder neural network. The input to the autoencoder consists of seismic traces, and the implicit function theorem is used to determine the Fréchet derivative, i.e., the perturbation of the skeletonized data with respect to the velocity perturbation. The gradient is computed by migrating the shifted observed traces weighted by the skeletonized data residual, and the final velocity model is the one that best predicts the observed latent-space parameters. We denote this as inversion by Newtonian machine learning (NML) because it inverts for the model parameters by combining the forward and backward modeling of Newtonian wave propagation with the dimensional reduction capability of machine learning. Empirical results suggest that inversion by NML can sometimes mitigate the cycle-skipping problem of conventional full-waveform inversion (FWI). Numerical tests with synthetic and field data demonstrate the success of NML inversion in recovering a low-wavenumber approximation to the subsurface velocity model. The advantage of this method over other skeletonized data methods is that no manual picking of important features is required because the skeletal data are automatically selected by the autoencoder. The disadvantage is that the inverted velocity model has less resolution compared with the FWI result, but it can serve as a good initial model for FWI. Our most significant contribution is that we provide a general framework for using wave-equation inversion to invert skeletal data generated by any type of neural network. In other words, we have combined the deterministic modeling of Newtonian physics and the pattern matching capabilities of machine learning to invert seismic data by NML.

