2014
DOI: 10.48550/arxiv.1410.3916
Preprint

Memory Networks

Jason Weston,
Sumit Chopra,
Antoine Bordes

Abstract: We describe a new class of learning models called memory networks. Memory networks reason with inference components combined with a long-term memory component; they learn how to use these jointly. The long-term memory can be read and written to, with the goal of using it for prediction. We investigate these models in the context of question answering (QA) where the long-term memory effectively acts as a (dynamic) knowledge base, and the output is a textual response. We evaluate them on a large-scale QA task, a…
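The paper decomposes a memory network into four components: an input feature map I, a generalization (write) step G that updates the memory, an output step O that reads the relevant memories, and a response module R that produces the answer. The following is a minimal, illustrative sketch of that pipeline, assuming a toy bag-of-words feature map and word-overlap scoring in place of the paper's learned embeddings and ranking losses; all data and helper names here are hypothetical.

```python
# A minimal, illustrative sketch of the memory network pipeline from the
# abstract: a long-term memory that is written to and read from, plus
# inference components that use it to answer questions. The I/G/O/R
# decomposition follows the paper; the bag-of-words features and
# word-overlap scoring are toy stand-ins for its learned embeddings.
from collections import Counter


def I(text):
    """Input feature map: convert raw text into a bag-of-words vector."""
    return Counter(text.lower().split())


def score(x, y):
    """Toy match score: word-overlap dot product of two bags of words."""
    return sum(x[w] * y[w] for w in x)


class MemoryNetwork:
    def __init__(self):
        self.memory = []  # long-term memory: a list of stored feature vectors

    def G(self, x):
        """Generalization: write the new input into the next memory slot."""
        self.memory.append(x)

    def O(self, q):
        """Output: retrieve the best-matching supporting memory for query q."""
        return max(self.memory, key=lambda m: score(q, m))

    def R(self, q, o):
        """Response: produce a textual answer from the retrieved memory.
        Here we simply echo the supporting memory's words; the paper
        instead learns a ranking over candidate answer words."""
        return " ".join(sorted(o))

    def answer(self, question):
        q = I(question)
        return self.R(q, self.O(q))


net = MemoryNetwork()
for sentence in ["Joe went to the kitchen", "Joe picked up the milk"]:
    net.G(I(sentence))
print(net.answer("where is the milk"))  # retrieves the 'milk' memory
```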

Cited by 266 publications (351 citation statements)
References 13 publications (18 reference statements)
“…However, the information within the memory cells is highly compressed and has limited representation ability. To overcome this issue, memory networks [40] were introduced to explicitly store important features. A commonly used memory network in video object segmentation is STM [24], which incrementally adds the features of past frames to the memory bank and leverages non-local spatio-temporal matching to provide spatio-temporal features.…”
Section: Memory Network
confidence: 99%
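As a rough illustration of the read/write loop the excerpt attributes to STM, here is a minimal memory-bank sketch in numpy. It assumes an upstream encoder has already produced per-frame key/value features; the real STM matches spatial feature maps non-locally, so the flat vectors and scaled dot-product read below are simplifications.

```python
# A rough sketch of the STM-style loop described above: incrementally
# write key/value features of past frames into a memory bank, then read
# for the current frame by attending over every stored item. Assumes an
# upstream encoder produced the features; the flat 64-d vectors stand in
# for STM's spatial feature maps.
import numpy as np


class MemoryBank:
    def __init__(self):
        self.keys, self.values = [], []

    def write(self, key, value):
        """Incrementally add the features of a past frame."""
        self.keys.append(key)
        self.values.append(value)

    def read(self, query):
        """Attend from the query frame's features to all memory items
        and return the attention-weighted value (a soft memory read)."""
        K = np.stack(self.keys)                   # (N, d)
        V = np.stack(self.values)                 # (N, d)
        logits = K @ query / np.sqrt(query.size)  # scaled dot-product scores
        w = np.exp(logits - logits.max())
        w /= w.sum()                              # softmax attention weights
        return w @ V                              # (d,) aggregated value


rng = np.random.default_rng(0)
bank = MemoryBank()
for _ in range(5):                                # five past frames
    bank.write(rng.normal(size=64), rng.normal(size=64))
current = bank.read(rng.normal(size=64))          # read for the current frame
```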
“…A growing area of research is that of augmenting generative models with external knowledge. Earlier works such as Memory Networks (Weston et al., 2014) and DrQA (Chen et al., 2017) used TF-IDF-based retrieval over documents to provide additional input to neural models for question answering, following the well-studied area of non-neural methods that use retrieval for QA (Voorhees, 2001). More recently, the RAG (Retrieval-Augmented Generation) and FiD (Fusion-in-Decoder) (Izacard and Grave, 2020) models developed these ideas further, using a neural retriever as well, with superior results.…”
Section: Related Work
confidence: 99%
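As a concrete (and deliberately tiny) illustration of the TF-IDF retrieval stage described above, the sketch below ranks documents by TF-IDF similarity to a question using scikit-learn. The corpus, the retrieve() helper, and the top-k cutoff are hypothetical stand-ins; in DrQA-style systems, a neural reader would consume the retrieved passages.

```python
# A tiny sketch of the TF-IDF retrieval stage: rank documents by TF-IDF
# similarity to the question and pass the top hits to a downstream
# reader. The corpus, the retrieve() helper, and k are hypothetical;
# only the scikit-learn vectorizer API is real.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "Memory networks couple inference components with a long-term memory.",
    "TF-IDF weights terms by frequency and inverse document frequency.",
    "Fusion-in-Decoder concatenates retrieved passages in the decoder.",
]

vectorizer = TfidfVectorizer()
doc_mat = vectorizer.fit_transform(docs)        # (n_docs, vocab), L2-normed


def retrieve(question, k=2):
    """Return the k documents most similar to the question under TF-IDF."""
    q = vectorizer.transform([question])
    scores = (doc_mat @ q.T).toarray().ravel()  # cosine (rows are L2-normed)
    return [docs[i] for i in scores.argsort()[::-1][:k]]


# The retrieved passages would be given to a neural QA model as extra input.
print(retrieve("what is a memory network"))
```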
“…But sometimes they fail to work well, since the memory capacity is too small to accurately record all the contents of the sequential data. Recently, Weston et al. [47] introduced memory networks, which use a specialized memory bank that can be read from and written to and provides better memorization. However, memory networks are hard to train via backpropagation because of the need for supervision at each layer during training.…”
Section: Related Work
confidence: 99%
“…Similar to [47, 35], the memory is content addressable, with a specific addressing scheme: it is addressed by computing attention weights w based on the similarity between the query q_k and each item m_i in the memory bank.…”
Section: Block-wise Memory Module
confidence: 99%
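The addressing scheme this excerpt describes reduces to a softmax over query-memory similarities followed by a weighted read. A small numpy sketch, assuming cosine similarity as the match score (the cited works may instead use a learned or scaled dot product):

```python
# A small numpy sketch of the content addressing in the quote: attention
# weights w from the similarity between query q_k and each memory item
# m_i, then a weighted read. Cosine similarity is assumed here; the
# cited works may use a learned or scaled dot product instead.
import numpy as np


def address(q, M):
    """M: (N, d) memory bank; q: (d,) query. Returns weights w, shape (N,)."""
    sims = (M @ q) / (np.linalg.norm(M, axis=1) * np.linalg.norm(q) + 1e-8)
    e = np.exp(sims - sims.max())
    return e / e.sum()                 # softmax over the N memory slots


M = np.random.default_rng(1).normal(size=(8, 16))  # 8 items, 16-dim each
q = M[3] + 0.1                                     # a query close to item 3
w = address(q, M)
read = w @ M                                       # soft read from the memory
print(w.argmax())                                  # weights peak at item 3
```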