2014
DOI: 10.48550/arxiv.1410.3916
Preprint

Memory Networks

Jason Weston,
Sumit Chopra,
Antoine Bordes

Abstract: We describe a new class of learning models called memory networks. Memory networks reason with inference components combined with a long-term memory component; they learn how to use these jointly. The long-term memory can be read and written to, with the goal of using it for prediction. We investigate these models in the context of question answering (QA) where the long-term memory effectively acts as a (dynamic) knowledge base, and the output is a textual response. We evaluate them on a large-scale QA task, a…
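The paper decomposes a memory network into four components: an input feature map I, a generalization (write) step G that updates the memory, an output step O that reads the relevant memories, and a response module R that produces the answer. The following is a minimal, illustrative sketch of that pipeline, assuming a toy bag-of-words feature map and word-overlap scoring in place of the paper's learned embeddings and ranking losses; all data and helper names here are hypothetical.

```python
# A minimal, illustrative sketch of the memory network pipeline from the
# abstract: a long-term memory that is written to and read from, plus
# inference components that use it to answer questions. The I/G/O/R
# decomposition follows the paper; the bag-of-words features and
# word-overlap scoring are toy stand-ins for its learned embeddings.
from collections import Counter


def I(text):
    """Input feature map: convert raw text into a bag-of-words vector."""
    return Counter(text.lower().split())


def score(x, y):
    """Toy match score: word-overlap dot product of two bags of words."""
    return sum(x[w] * y[w] for w in x)


class MemoryNetwork:
    def __init__(self):
        self.memory = []  # long-term memory: a list of stored feature vectors

    def G(self, x):
        """Generalization: write the new input into the next memory slot."""
        self.memory.append(x)

    def O(self, q):
        """Output: retrieve the best-matching supporting memory for query q."""
        return max(self.memory, key=lambda m: score(q, m))

    def R(self, q, o):
        """Response: produce a textual answer from the retrieved memory.
        Here we simply echo the supporting memory's words; the paper
        instead learns a ranking over candidate answer words."""
        return " ".join(sorted(o))

    def answer(self, question):
        q = I(question)
        return self.R(q, self.O(q))


net = MemoryNetwork()
for sentence in ["Joe went to the kitchen", "Joe picked up the milk"]:
    net.G(I(sentence))
print(net.answer("where is the milk"))  # retrieves the 'milk' memory
```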

Cited by 266 publications (351 citation statements)
References 13 publications (18 reference statements)
“…However, the information within the memory cells is highly compressed and has limited representation ability. To overcome this issue, memory networks [40] were introduced to explicitly store important features. A commonly used memory network in video object segmentation is STM [24], which incrementally adds the features of past frames to the memory bank and leverages non-local spatio-temporal matching to provide spatio-temporal features.…”
Section: Memory Network
confidence: 99%
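As a rough illustration of the read/write loop the excerpt attributes to STM, here is a minimal memory-bank sketch in numpy. It assumes an upstream encoder has already produced per-frame key/value features; the real STM matches spatial feature maps non-locally, so the flat vectors and scaled dot-product read below are simplifications.

```python
# A rough sketch of the STM-style loop described above: incrementally
# write key/value features of past frames into a memory bank, then read
# for the current frame by attending over every stored item. Assumes an
# upstream encoder produced the features; the flat 64-d vectors stand in
# for STM's spatial feature maps.
import numpy as np


class MemoryBank:
    def __init__(self):
        self.keys, self.values = [], []

    def write(self, key, value):
        """Incrementally add the features of a past frame."""
        self.keys.append(key)
        self.values.append(value)

    def read(self, query):
        """Attend from the query frame's features to all memory items
        and return the attention-weighted value (a soft memory read)."""
        K = np.stack(self.keys)                   # (N, d)
        V = np.stack(self.values)                 # (N, d)
        logits = K @ query / np.sqrt(query.size)  # scaled dot-product scores
        w = np.exp(logits - logits.max())
        w /= w.sum()                              # softmax attention weights
        return w @ V                              # (d,) aggregated value


rng = np.random.default_rng(0)
bank = MemoryBank()
for _ in range(5):                                # five past frames
    bank.write(rng.normal(size=64), rng.normal(size=64))
current = bank.read(rng.normal(size=64))          # read for the current frame
```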
“…A growing area of research is that of augmenting generative models with external knowledge. Earlier works such as Memory Networks (Weston et al., 2014) and DrQA (Chen et al., 2017) used TF-IDF-based retrieval over documents to provide additional input to neural models for question answering, following the well-studied area of non-neural methods that use retrieval for QA (Voorhees, 2001). More recently, the RAG (Retrieval-Augmented Generation) and FiD (Fusion-in-Decoder) (Izacard and Grave, 2020) models developed these ideas further, using a neural retriever as well, with superior results.…”
Section: Related Work
confidence: 99%
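As a concrete (and deliberately tiny) illustration of the TF-IDF retrieval stage described above, the sketch below ranks documents by TF-IDF similarity to a question using scikit-learn. The corpus, the retrieve() helper, and the top-k cutoff are hypothetical stand-ins; in DrQA-style systems, a neural reader would consume the retrieved passages.

```python
# A tiny sketch of the TF-IDF retrieval stage: rank documents by TF-IDF
# similarity to the question and pass the top hits to a downstream
# reader. The corpus, the retrieve() helper, and k are hypothetical;
# only the scikit-learn vectorizer API is real.
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "Memory networks couple inference components with a long-term memory.",
    "TF-IDF weights terms by frequency and inverse document frequency.",
    "Fusion-in-Decoder concatenates retrieved passages in the decoder.",
]

vectorizer = TfidfVectorizer()
doc_mat = vectorizer.fit_transform(docs)        # (n_docs, vocab), L2-normed


def retrieve(question, k=2):
    """Return the k documents most similar to the question under TF-IDF."""
    q = vectorizer.transform([question])
    scores = (doc_mat @ q.T).toarray().ravel()  # cosine (rows are L2-normed)
    return [docs[i] for i in scores.argsort()[::-1][:k]]


# The retrieved passages would be given to a neural QA model as extra input.
print(retrieve("what is a memory network"))
```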
“…But sometimes they fail to work well, since the memory capacity is too small to accurately record all the contents of the sequential data. Recently, Weston et al. [47] introduced memory networks, which use a specialized memory bank that can be read from and written to and provides better memorization. However, memory networks are hard to train via backpropagation because of the need for supervision at each layer during training.…”
Section: Related Work
confidence: 99%
“…Similar to [47, 35], the memory is content addressable, with a specific addressing scheme: it is addressed by computing attention weights w based on the similarity between the query q_k and each item m_i in the memory bank.…”
Section: Block-wise Memory Module
confidence: 99%
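The addressing scheme this excerpt describes reduces to a softmax over query-memory similarities followed by a weighted read. A small numpy sketch, assuming cosine similarity as the match score (the cited works may instead use a learned or scaled dot product):

```python
# A small numpy sketch of the content addressing in the quote: attention
# weights w from the similarity between query q_k and each memory item
# m_i, then a weighted read. Cosine similarity is assumed here; the
# cited works may use a learned or scaled dot product instead.
import numpy as np


def address(q, M):
    """M: (N, d) memory bank; q: (d,) query. Returns weights w, shape (N,)."""
    sims = (M @ q) / (np.linalg.norm(M, axis=1) * np.linalg.norm(q) + 1e-8)
    e = np.exp(sims - sims.max())
    return e / e.sum()                 # softmax over the N memory slots


M = np.random.default_rng(1).normal(size=(8, 16))  # 8 items, 16-dim each
q = M[3] + 0.1                                     # a query close to item 3
w = address(q, M)
read = w @ M                                       # soft read from the memory
print(w.argmax())                                  # weights peak at item 3
```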