2019
DOI: 10.48550/arxiv.1910.13466
Preprint

Ordered Memory

Yikang Shen,
Shawn Tan,
Arian Hosseini
et al.

Abstract: Stack-augmented recurrent neural networks (RNNs) have been of interest to the deep learning community for some time. However, the difficulty of training memory models remains a problem obstructing the widespread use of such models. In this paper, we propose the Ordered Memory architecture. Inspired by Ordered Neurons (Shen et al., 2018), we introduce a new attention-based mechanism and use its cumulative probability to control the writing and erasing operations of the memory. We also introduce a new Gated Recursive…
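The abstract describes write and erase gates derived from the cumulative probability of an attention distribution over memory slots. The following is a minimal sketch of that idea in PyTorch, not the authors' implementation: the function name, the direction of the cumulative sum, and the exact gating arithmetic are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def write_with_cumulative_gates(memory, candidate, slot_logits):
    """Illustrative write/erase step for a slot-based memory (sketch only).

    memory:      (num_slots, dim)  current memory slots
    candidate:   (dim,)            new content to be written
    slot_logits: (num_slots,)      attention scores over slots

    The cumulative probability of the attention distribution acts as a
    soft erase gate: slots on one side of the attended position are
    mostly overwritten, slots on the other side are mostly kept.  The
    direction of the cumulative sum is an assumption here.
    """
    probs = F.softmax(slot_logits, dim=0)      # attention over slots
    erase_gate = torch.cumsum(probs, dim=0)    # cumulative probability
    keep_gate = 1.0 - erase_gate               # complement preserves old content
    new_memory = (keep_gate.unsqueeze(1) * memory
                  + erase_gate.unsqueeze(1) * candidate.unsqueeze(0))
    return new_memory

# Toy usage: four 8-dimensional memory slots updated with one candidate vector.
mem = torch.zeros(4, 8)
cand = torch.randn(8)
logits = torch.randn(4)
mem = write_with_cumulative_gates(mem, cand, logits)
```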

Cited by 1 publication (1 citation statement)
References 24 publications
“…It is noteworthy that there are many other important miscellaneous works we do not mention in the previous sections. For example, numerous works have proposed to improve upon vanilla gradient-based methods [174,178,65]; linguistic rules such as negation and morphological inflection can be extracted by neural models [141,142,158]; probing tasks can be used to explore linguistic properties of sentences [3,80,43,75,89,74,34]; the hidden state dynamics in recurrent nets are analysed to illuminate the learned long-range dependencies [73,96,67,179,94]; [169,166,168,101,57,167] studied the ability of neural sequence models to induce lexical, grammatical and syntactic structures; [91,90,12,136,159,24,151,85] modeled the reasoning process of the model to explain model behaviors; [157,139,28,163,219,170,180,137,106,58,162,81...…”
Section: Miscellaneous (mentioning)
confidence: 99%