Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing
DOI: 10.18653/v1/d16-1147
Key-Value Memory Networks for Directly Reading Documents

Abstract: Directly reading documents and being able to answer questions from them is an unsolved challenge. To avoid its inherent difficulty, question answering (QA) has been directed towards using Knowledge Bases (KBs) instead, which has proven effective. Unfortunately KBs often suffer from being too restrictive, as the schema cannot support certain types of answers, and too sparse, e.g. Wikipedia contains much more information than Freebase. In this work we introduce a new method, Key-Value Memory Networks, that makes…
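As a rough sketch of the mechanism the abstract describes, the snippet below shows a single key-value memory read in NumPy: the memory is addressed with its key encodings, while the returned reading is a weighted sum over the separately encoded values. Array names are illustrative, and the paper's feature maps and multi-hop query updates are omitted.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def key_value_read(query, keys, values):
    """One key-value memory read: address by keys, read out values.

    query: (dim,) encoded question; keys, values: (slots, dim) arrays.
    """
    addressing = softmax(keys @ query)  # relevance of each slot, scored against its key
    return addressing @ values          # the reading comes from the values, not the keys
```

Separating keys from values is what lets the encoding used for addressing (e.g. a window of words) differ from what is actually returned (e.g. the window's centre word or a KB entity).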

Cited by 781 publications (542 citation statements)
References 17 publications
“…We would also like to investigate alternatives to reinforcement learning for implementing sparse attention, e.g. sparsemax (Martins and Astudillo, 2016) and key-value memory networks (Miller et al., 2016) (preliminary investigations with sparsemax were not extremely promising, but we leave this to future work). Resolving these issues can allow attention models to become more scalable, especially in computationally intensive tasks such as document summarization.…”
Section: Results
Confidence: 99%
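For reference, sparsemax (Martins and Astudillo, 2016), the alternative this excerpt mentions, is the Euclidean projection of the score vector onto the probability simplex, so it can assign exactly zero weight to low-scoring items. Below is a minimal NumPy sketch; the function name and usage are illustrative, not the cited authors' code.

```python
import numpy as np

def sparsemax(z):
    """Project scores z onto the probability simplex (Martins & Astudillo, 2016)."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]          # scores in descending order
    k = np.arange(1, len(z) + 1)
    cumsum = np.cumsum(z_sorted)
    # support: the largest k with 1 + k * z_(k) greater than the top-k cumulative sum
    k_z = k[1 + k * z_sorted > cumsum][-1]
    tau = (cumsum[k_z - 1] - 1.0) / k_z  # threshold subtracted from every score
    return np.maximum(z - tau, 0.0)

print(sparsemax([2.0, 1.0, 0.1]))  # -> [1.0, 0.0, 0.0]: all mass on one item
```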
“…Comparing IRNs to Memory Networks (MemNN) (Sukhbaatar et al., 2015; Miller et al., 2016) and Neural Turing Machines (NTM) (Graves et al., 2014, 2016), the biggest difference between our model and the existing frameworks is the controller and the use of the shared memory. We follow Shen et al. (2017) in using a controller module to dynamically perform multi-step inference depending on the complexity of the instance.…”
Section: Related Work
Confidence: 98%
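The controller-plus-shared-memory idea this excerpt contrasts with MemNN and NTM can be pictured as a state that repeatedly addresses the memory and folds each reading back into itself. The sketch below is a simplified illustration under that assumption (a fixed number of hops, no learned termination or transformation matrices), not the IRN of the cited paper.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def multi_hop_read(query, memory, hops=3):
    """Multi-step inference: the controller state q addresses the shared
    memory, and each reading is folded back into q before the next hop.

    query: (dim,) initial state; memory: (slots, dim) array shared across steps.
    """
    q = query
    for _ in range(hops):
        attn = softmax(memory @ q)  # address the shared memory with the current state
        q = q + attn @ memory       # update the controller state with the reading
    return q
```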
“…Memory is an effective way to equip seq2seq systems with external information (Weston et al., 2014; Sukhbaatar et al., 2015; Miller et al., 2016; Kumar et al., 2015). GenQA (Yin et al., 2015) applies a seq2seq model to generate natural answer sentences from a knowledge base, and CoreQA (He et al., 2017b) extends it with a copying mechanism (Gu et al., 2016).…”
Section: Related Work
Confidence: 99%
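As a concrete reading of the copying mechanism this excerpt mentions, the sketch below mixes a generation distribution over the vocabulary with attention weights over source positions via a scalar gate. This gated form is a simplification (closer in spirit to pointer-generator models than to the exact CopyNet scoring of Gu et al., 2016), and all names here are assumptions.

```python
import numpy as np

def copy_mix(p_vocab, attn, src_ids, p_gen):
    """Mix generating from the vocabulary with copying source tokens.

    p_vocab: (V,) generation distribution; attn: (T,) attention over the
    source; src_ids: length-T vocabulary ids of the source tokens;
    p_gen: scalar gate in [0, 1] deciding generate vs. copy.
    """
    p = p_gen * np.asarray(p_vocab, dtype=float)
    for pos, tok in enumerate(src_ids):
        p[tok] += (1.0 - p_gen) * attn[pos]  # attention mass routed to the copied token
    return p  # still a distribution if p_vocab and attn each sum to 1
```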