2018
DOI: 10.48550/arxiv.1810.09536
Preprint

Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks

Abstract: Natural language is hierarchically structured: smaller units (e.g., phrases) are nested within larger units (e.g., clauses). When a larger constituent ends, all of the smaller constituents that are nested within it must also be closed. While the standard LSTM architecture allows different neurons to track information at different time scales, it does not have an explicit bias towards modeling a hierarchy of constituents. This paper proposes to add such an inductive bias by ordering the neurons; a vector of mas…
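The abstract is truncated above; in the full paper the ordering is enforced through master input and forget gates computed with a cumulative softmax ("cumax"). The sketch below, assuming PyTorch, illustrates that gating scheme; the function names and shapes are illustrative rather than the authors' reference implementation.

```python
# Minimal sketch of the ON-LSTM master-gate idea (illustrative, not the
# authors' reference code). Assumes PyTorch.
import torch
import torch.nn.functional as F

def cumax(x, dim=-1):
    """Cumulative softmax: a monotonically non-decreasing vector in [0, 1]."""
    return torch.cumsum(F.softmax(x, dim=dim), dim=dim)

def ordered_cell_update(c_prev, c_hat, f, i, logits_f, logits_i):
    # The master forget gate rises from ~0 to 1 across the ordered neurons, so
    # high-ranking (long-term) neurons are kept while low-ranking ones are
    # erased; the master input gate is the mirror image for writing new info.
    f_master = cumax(logits_f)          # non-decreasing along the ordering
    i_master = 1.0 - cumax(logits_i)    # non-increasing along the ordering

    # Combine with the ordinary LSTM gates: inside the overlap region the
    # standard gates act; outside it, whole blocks of the cell state are
    # either copied or overwritten, which gives the hierarchical bias.
    omega = f_master * i_master
    f_hat = f * omega + (f_master - omega)
    i_hat = i * omega + (i_master - omega)
    return f_hat * c_prev + i_hat * c_hat
```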

Cited by 22 publications (72 citation statements)
References 29 publications

Ordered Memory
Shen, Tan, Hosseini et al. 2019 | Preprint | Self Cite
“…In this paper, we introduce the Ordered Memory architecture. The model is conceptually close to previous stack-augmented RNNs, but with two important differences: 1) we replace the pop and push operations with a new writing and erasing mechanism inspired by Ordered Neurons (Shen et al, 2018); 2) we also introduce a new Gated Recursive Cell to compose lower level representations into higher level one. On the logical inference and ListOps tasks, we show that the model learns the proper tree structures required to solve them.…”
Section: Discussion (mentioning)
confidence: 99%
“…Ordered Memory is implemented following the principles introduced in Ordered Neurons (Shen et al, 2018). Our model is related to ON-LSTM in several aspects: 1) The memory slots are similar to the chunks in ON-LSTM, when a higher ranking memory slot is forgotten/updated, all lower ranking memory slots should likewise be forgotten/updated; 2) ON-LSTM uses the monotonically non-decreasing master forget gate to preserve long-term information while erasing short term information, the OM model uses the cumulative probability − → π t ; 3) Similarly, the master input gate used by ON-LSTM to control the writing of new information into the memory is replaced with the reversed cumulative probability ← − π t in the OM model.…”
Section: Relations to ON-LSTM and Shift-Reduce Parser (mentioning)
confidence: 99%
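As a hedged illustration of the monotonicity this comparison relies on: a cumulative softmax over hypothetical gate logits is non-decreasing and ends at 1, so a near-zero gate at some position forces near-zero gates at every lower-ranking position, which is the "forget a higher slot, forget everything below it" behaviour described above (PyTorch assumed; the logits are made up).

```python
import torch
import torch.nn.functional as F

# Made-up logits for a 5-neuron master forget gate.
logits = torch.tensor([0.1, 2.5, 0.3, 0.2, 0.4])
f_master = torch.cumsum(F.softmax(logits, dim=-1), dim=-1)
print(f_master)
# Values rise monotonically from near 0 to exactly 1. A gate near 0 at some
# position therefore implies gates near 0 at every earlier (lower-ranking)
# position: erasing a higher-ranking slot erases all slots below it.
```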

“…Ordered-Neurons LSTMs (ON-LSTMs) Based on the intuition that larger constituents contain information that changes more slowly across the sentence, Shen et al [2018] suggested a variant of LSTMs, called Ordered-Neurons LSTMs, which imposes a hierarchical bias on the cell-updating mechanism. Given the hierarchical nature of our data, we expected ON-LSTMs to perform well on the number-agreements tasks.…”
Section: RNNs with a Structural Bias (mentioning)
confidence: 99%