Neural Random-Access Machines

Kurach, Karol; Andrychowicz, Marcin; Sutskever, Ilya

doi:10.48550/arxiv.1511.06392

Cited by 29 publications

(32 citation statements)

References 18 publications


“…Neural Algorithm Execution. Many works have studied neural execution in different domains before (Zaremba & Sutskever, 2014; Kaiser & Sutskever, 2015; Kurach et al, 2015; Reed & De Freitas, 2015; Santoro et al, 2018; Yan et al, 2020). With the rapid development of GNNs in graph representation learning, learning graph algorithms with GNNs has attracted researchers' attention (Veličković et al, 2019; Xhonneux et al, 2021).…”

Section: Related Work (mentioning)

confidence: 99%

Neural Approximation of Graph Topological Features

Yan, Ma, Gao et al. 2022

Preprint


Persistent homology is a widely used theory in topological data analysis. In the context of graph learning, topological features based on persistent homology have been used to capture potentially high-order structural information so as to augment existing graph neural network methods. However, computing extended persistent homology summaries remains slow for large and dense graphs, especially since in learning applications one has to carry out this computation potentially many times. Inspired by recent success in neural algorithmic reasoning, we propose a novel learning method to compute extended persistence diagrams on graphs. The proposed neural network aims to simulate a specific algorithm and learns to compute extended persistence diagrams for new graphs efficiently. Experiments on approximating extended persistence diagrams and several downstream graph representation learning tasks demonstrate the effectiveness of our method. Our method is also efficient; on large and dense graphs, we accelerate the computation by nearly 100 times.



“…A number of methods augment the short- and long-term memory internal to recurrent networks with external "working" memory, in order to realize differentiable programming architectures that can learn to model and execute various programs , Sukhbaatar et al, 2015, Joulin and Mikolov, 2015, Reed and de Freitas, 2015, Grefenstette et al, 2015, Kurach et al, 2015. Unlike our approach, these methods explicitly decouple memory from computation, mimicking a standard computer architecture.…”

Section: Related Work (mentioning)

confidence: 99%

Multigrid Neural Memory

Huynh, Maire, Walter 2019

Preprint


We introduce a novel architecture that integrates a large addressable memory space into the core functionality of a deep neural network. Our design distributes both memory addressing operations and storage capacity over many network layers. Distinct from strategies that connect neural networks to external memory banks, our approach co-locates memory with computation throughout the network structure. Mirroring recent architectural innovations in convolutional networks, we organize memory into a multiresolution hierarchy, whose internal connectivity enables learning of dynamic information routing strategies and data-dependent read/write operations. This multigrid spatial layout permits parameter-efficient scaling of memory size, allowing us to experiment with memories substantially larger than those in prior work. We demonstrate this capability on synthetic exploration and mapping tasks, where the network is able to self-organize and retain long-term memory for trajectories of thousands of time steps. On tasks decoupled from any notion of spatial geometry, such as sorting or associative recall, our design functions as a truly generic memory and yields results competitive with those of the recently proposed Differentiable Neural Computer.


“…A central challenge of non-convex optimization is avoiding sub-optimal local minima. Although it has been shown that the variable can sometimes converge to a neighborhood of the global minimum by adding noise [16,17,18,19,20], the convergence rate is still a problem. Note that the DP method has some probability to escape "appropriately shallow" local minima because the moving direction of the variable is generated by solving several sub-problems instead of the original problem.…”

Section: Related Work (mentioning)

confidence: 99%

SVGD: A Virtual Gradients Descent Method for Stochastic Optimization

Li, Shu 2019

Preprint


Inspired by dynamic programming, we propose Stochastic Virtual Gradient Descent (SVGD) algorithm where the Virtual Gradient is defined by computational graph and automatic differentiation. The method is computationally efficient and has little memory requirements. We also analyze the theoretical convergence properties and implementation of the algorithm. Experimental results on multiple datasets and network models show that SVGD has advantages over other stochastic optimization methods.

