2016
DOI: 10.1038/nature20101

Hybrid computing using a neural network with dynamic external memory

Abstract: Artificial neural networks are remarkably adept at sensory processing, sequence learning and reinforcement learning, but are limited in their ability to represent variables and data structures and to store data over long timescales, owing to the lack of an external memory. Here we introduce a machine learning model called a differentiable neural computer (DNC), which consists of a neural network that can read from and write to an external memory matrix, analogous to the random-access memory in a conventional computer. […]
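The abstract's core idea, a controller network that reads from and writes to a memory matrix through differentiable attention-based weightings, can be sketched as follows. This is only a minimal NumPy illustration of content-based addressing with soft read/write, not the DNC's full mechanism (dynamic memory allocation and temporal link tracking are omitted); the function names and toy dimensions are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def content_weights(memory, key, beta):
    """Attention over memory rows: cosine similarity to a key, sharpened by beta."""
    norms = np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8
    return softmax(beta * (memory @ key) / norms)

def memory_read(memory, w):
    """Differentiable read: a weighted blend of memory rows."""
    return w @ memory

def memory_write(memory, w, erase, add):
    """Differentiable write: erase a fraction of each row, then add new content."""
    memory = memory * (1.0 - np.outer(w, erase))
    return memory + np.outer(w, add)

# Toy usage: 4 memory slots of width 3 (illustrative sizes).
M = np.zeros((4, 3))
key = np.array([0.1, 0.2, 0.3])
w = content_weights(M, key, beta=5.0)                  # where to write
M = memory_write(M, w, erase=np.ones(3), add=key)      # store the key's content
r = memory_read(M, content_weights(M, key, beta=5.0))  # retrieve it by content
```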

Cited by 1,230 publications (1,034 citation statements)
References: 26 publications
“…However, this could be reversed, giving a device that learns to construct context-free programs (e.g., expression trees) given only observed outputs; one application would be unsupervised parsing. Such an extension of the work would make it an alternative to architectures that have an explicit external memory, such as neural Turing machines (Graves et al., 2014) and memory networks (Weston et al., 2015). However, as with those models, without supervision of the stack operations, formidable computational challenges must be solved (e.g., marginalizing over all latent stack operations), but sampling techniques and techniques from reinforcement learning have promise here (Zaremba and Sutskever, 2015), making this an intriguing avenue for future work.…”
Section: Results (mentioning)
Confidence: 99%
“…Nevertheless, TD-LSTM might not work well when the opinion word is far from the target, because the captured feature is likely to be lost (similar problems with LSTM-based models have been reported in machine translation). Graves et al. (2014) introduced the concept of memory for neural networks and proposed a differentiable process to read and write memory, called the Neural Turing Machine (NTM). The attention mechanism, which has been used successfully in many areas (Rush et al., 2015), can be treated as a simplified version of the NTM, because the size of the memory is unlimited and we only need to read from it.…”
Section: Related Work (mentioning)
Confidence: 99%
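As a hedged illustration of the excerpt's point, that an attention read can be viewed as a simplified NTM interaction (content-based addressing only, over a memory of unbounded size, with no writes), here is a minimal NumPy sketch; the dot-product scoring and the names are assumptions for illustration, not the cited models' exact formulation.

```python
import numpy as np

def attention_read(memory, query):
    """Soft attention over memory rows: dot-product scores -> softmax -> weighted sum.
    Read-only content addressing, i.e. an NTM-style read without any write head."""
    scores = memory @ query                   # one score per memory row
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ memory, weights

# Toy usage: the "memory" is simply all encoder states, however many there are.
encoder_states = np.random.randn(7, 5)
context, attn = attention_read(encoder_states, query=np.random.randn(5))
```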
“…This distinguishes the key-variable memory from other memory-augmented neural networks that use continuous differentiable embeddings as the values of memory entries (Weston et al., 2014; Graves et al., 2016a).…”
Section: Entity Resolver (mentioning)
Confidence: 99%
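To make the contrast in the excerpt concrete: a key-variable memory pairs differentiable embedding keys with symbolic values (e.g. entity identifiers), whereas memory networks and the DNC store continuous embeddings as the values themselves. The sketch below is an assumed, simplified illustration of that distinction only; the names, the hard argmax selection, and the toy data are not taken from the cited papers.

```python
import numpy as np

def key_variable_lookup(keys, values, query):
    """Score embedding keys against a query, then return the symbolic value
    of the best-matching entry (values are discrete, not embeddings)."""
    scores = keys @ query
    return values[int(scores.argmax())]

def embedding_memory_read(memory, query):
    """Fully differentiable alternative: values are embeddings, read is a soft blend."""
    scores = memory @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ memory

keys = np.random.randn(3, 4)                        # one key embedding per entity
values = ["ENT_Paris", "ENT_London", "ENT_Tokyo"]   # symbolic variable values
print(key_variable_lookup(keys, values, np.random.randn(4)))
```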