2019
DOI: 10.48550/arxiv.1906.10652
Preprint
Monte Carlo Gradient Estimation in Machine Learning

Abstract: This paper is a broad and accessible survey of the methods we have at our disposal for Monte Carlo gradient estimation in machine learning and across the statistical sciences: the problem of computing the gradient of an expectation of a function with respect to parameters defining the distribution that is integrated; the problem of sensitivity analysis. In machine learning research, this gradient problem lies at the core of many learning problems, in supervised, unsupervised and reinforcement learning. We will…
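To make the gradient-of-an-expectation problem concrete, here is a minimal sketch (my own illustration, not code from the paper) comparing the two estimator families the survey covers, the score-function and pathwise estimators, on a toy Gaussian objective whose exact gradient is known. The objective f(x) = x² and all variable names are illustrative assumptions.

```python
# Minimal sketch (not from the paper): two Monte Carlo estimators of
#   d/d_theta  E_{x ~ N(theta, 1)}[ x^2 ],
# whose exact value is 2 * theta.
import numpy as np

rng = np.random.default_rng(0)
theta, n = 1.5, 100_000
true_grad = 2.0 * theta

# Score-function (REINFORCE) estimator: f(x) * d/d_theta log N(x; theta, 1)
x = rng.normal(theta, 1.0, size=n)
score = x - theta                      # d/d_theta log N(x; theta, 1)
sf_grad = np.mean(x**2 * score)

# Pathwise (reparameterisation) estimator: write x = theta + eps with
# eps ~ N(0, 1), then differentiate f(x) = x^2 directly: df/d_theta = 2x.
eps = rng.normal(0.0, 1.0, size=n)
pw_grad = np.mean(2.0 * (theta + eps))

print(f"true {true_grad:.3f}  score-function {sf_grad:.3f}  pathwise {pw_grad:.3f}")
```

Both estimates converge to 2θ; the pathwise estimate typically has much lower variance here because it uses the derivative of the integrand rather than only its value.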

Cited by 29 publications (51 citation statements)
References 69 publications (99 reference statements)

Citation statements
“…In fact, when the ansatz is close to an eigenstate of Ĥ, then E_loc(σ) ≈ E, which means that the variance of gradients Var(∂_{λ_j} E) ≈ 0 for each variational parameter λ_j. We note that this is similar in spirit to the control variate methods in Monte Carlo and to the baseline methods in reinforcement learning [51].…”
Section: Variational Monte Carlo
confidence: 59%
“…Here, we can subtract the term E in order to reduce noise in the stochastic estimation of our gradients without introducing a bias [20, 51]. In fact, when the ansatz is close to an eigenstate of Ĥ, then E_loc(σ) ≈ E, which means that the variance of gradients Var(∂_{λ_j} E) ≈ 0 for each variational parameter λ_j.…”
Section: Variational Monte Carlo
confidence: 99%
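The variance-reduction idea in the two statements above can be shown with a small sketch (my own illustration, not the cited works' code): subtracting a baseline close to the mean of the integrand leaves the score-function estimator's expectation unchanged, because the score has zero mean, while shrinking its variance. The toy integrand f(x) = x² merely stands in for the local energy; all names are placeholders.

```python
# Minimal sketch (illustrative, not from the cited papers): a constant-like
# baseline subtracted from the integrand keeps the score-function gradient
# estimator (approximately) unbiased and reduces its variance.
import numpy as np

rng = np.random.default_rng(1)
theta, n = 1.5, 200_000
x = rng.normal(theta, 1.0, size=n)
f = x**2                               # toy integrand standing in for E_loc
score = x - theta                      # d/d_theta log N(x; theta, 1)

plain    = f * score                   # no control variate
baseline = (f - f.mean()) * score      # subtract the estimated mean of f

print("mean:", plain.mean(), baseline.mean())   # both ≈ 2 * theta
print("var :", plain.var(),  baseline.var())    # baseline version is smaller
```

Using the estimated mean as the baseline introduces only an O(1/n) perturbation, which is why the statement above can describe the subtraction as noise reduction "without introducing a bias".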
“…We remark that π in the agent optimization (7a) and π in the adversary optimization (7b) are different, as are the pessimistic and optimistic hallucination policies η^(p) and η^(o). In particular, we learn a critic via fitted Q-iteration (Perolat et al., 2015; Antos et al., 2008) and then differentiate through the critic using pathwise gradients (Mohamed et al., 2019; Silver et al., 2014), using stochastic gradient ascent for the agent and stochastic gradient descent for the adversary.…”
Section: Practical Implementation
confidence: 99%
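A hedged sketch of what "differentiating through the critic with pathwise gradients" can look like in practice (an illustrative PyTorch fragment, not the cited paper's implementation; network sizes, names, and the random data are placeholder assumptions):

```python
# Minimal sketch (illustrative): a pathwise gradient through a learned critic.
# The critic Q(s, a) is differentiable in the action, so the agent's parameters
# are updated by backpropagating dQ/da * da/dphi through the critic.
import torch
import torch.nn as nn

state_dim, action_dim = 4, 2
critic = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.Tanh(), nn.Linear(64, 1))
actor  = nn.Sequential(nn.Linear(state_dim, 64), nn.Tanh(), nn.Linear(64, action_dim))
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-3)

states = torch.randn(32, state_dim)             # batch of placeholder states

# Agent step: gradient *ascent* on the critic's value.
actions = actor(states)                          # a = pi_phi(s), differentiable in phi
q = critic(torch.cat([states, actions], dim=-1)).mean()
actor_opt.zero_grad()
(-q).backward()                                  # minimising -Q == ascending Q
actor_opt.step()
# An adversary, as in the statement above, would take the symmetric descent
# step on its own parameters through the same critic.
```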
“…This improves training stability and is a standard technique deployed in machine learning to improve gradient estimation [38,39].…”
Section: Training Details
confidence: 99%