Proceedings of the Third Workshop on Representation Learning for NLP 2018
DOI: 10.18653/v1/w18-3020

Learning Hierarchical Structures On-The-Fly with a Recurrent-Recursive Model for Sequences

Abstract: We propose a hierarchical model for sequential data that learns a tree on-the-fly, i.e. while reading the sequence. In the model, a recurrent network adapts its structure and reuses recurrent weights in a recursive manner. This creates adaptive skip-connections that ease the learning of long-term dependencies. The tree structure can either be inferred without supervision through reinforcement learning, or learned in a supervised manner. We provide preliminary experiments in a novel Math Expression Evaluation (M…
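To make the architecture sketched in the abstract concrete, here is a minimal, hypothetical PyTorch rendering of the idea: a single recurrent cell whose weights are also reused recursively to merge earlier states, so that a merged node acts as an adaptive skip-connection when the next token is read. This is not the authors' RRNet implementation; the class name, the use of a GRUCell, the hard merge decision, and the assumption that inputs are pre-embedded to the hidden size are all illustrative choices, whereas the paper infers the merge decisions with reinforcement learning or supervision.

```python
import torch
import torch.nn as nn


class RecurrentRecursiveSketch(nn.Module):
    """Hypothetical sketch (not the authors' code): one GRUCell serves both as
    the recurrent update and, reused recursively, as the composition function
    that merges past states into a parent node. Inputs are assumed to be
    pre-embedded to hidden_size so the same weights can play both roles."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.cell = nn.GRUCell(hidden_size, hidden_size)  # shared recurrent/recursive weights
        self.merge_gate = nn.Linear(2 * hidden_size, 1)   # shift-vs-merge decision
        self.hidden_size = hidden_size

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (seq_len, batch, hidden_size)
        batch = tokens.size(1)
        stack = [tokens.new_zeros(batch, self.hidden_size)]  # initial state
        for x_t in tokens:
            # Optionally fold the two most recent nodes into one parent,
            # reusing the same cell weights (the recursive step).
            if len(stack) >= 2:
                p = torch.sigmoid(
                    self.merge_gate(torch.cat([stack[-2], stack[-1]], dim=-1)))
                if p.mean() > 0.5:        # hard choice here; the paper instead
                    child = stack.pop()   # learns it with RL or supervision
                    parent = stack.pop()
                    stack.append(self.cell(child, parent))
            # Ordinary recurrent update; a freshly merged state lets this step
            # skip over the whole merged span (the adaptive skip-connection).
            stack.append(self.cell(x_t, stack[-1]))
        return stack[-1]


# usage sketch: final_state = RecurrentRecursiveSketch(64)(torch.randn(12, 2, 64))
```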

Cited by 12 publications (9 citation statements)
References 13 publications
“…Although this concept can be related to the prioritisation of information in the human visual cortex (Hassabis et al, 2017), it seems contrary to the incremental processing of information in a language context, as for instance recently shown empirically for the understanding of conjunctive generic sentences (Tessler et al, 2019). In machine learning, the idea of incrementality has already played a role in several problem statements, such as inferring the tree structure of a sentence (Jacob et al, 2018), parsing (Köhn and Menzel, 2014), or in other problems that are naturally equipped with time constraints like real-time neural machine translation (Neubig et al, 2017; Dalvi et al, 2018a) and speech recognition (Baumann et al, 2009; Jaitly et al, 2016; Graves, 2012). Other approaches try to encourage incremental behavior implicitly by modifying the model architecture or the training objective: Guan et al (2018) introduce an encoder with an incremental self-attention scheme for story generation.…”
Section: Related Work
confidence: 99%
“…We compare CRvNN with Tree-LSTM (Tai et al, 2015), Tree-Cell (Shen et al, 2019a), Tree-RNN (Bowman et al, 2015b), Transformer (Vaswani et al, 2017), Universal Transformer (Dehghani et al, 2019), LSTM (Hochreiter & Schmidhuber, 1997), RRNet (Jacob et al, 2018), ON-LSTM (Shen et al, 2019b), Ordered Memory (Shen et al, 2019a) (see Table 2).…”
Section: Logical Inference
confidence: 99%
“…The recursive application of the same composition function is well suited for this task. We also include the result of RRNet (Jacob et al, 2018), which can induce the latent tree structure from downstream tasks. Note that the results may not be comparable, because the hyper-parameters for training were not provided.…”
Section: Logical Inference
confidence: 99%
“…years (Bowman et al, 2016; Yogatama et al, 2016; Shen et al, 2017; Jacob et al, 2018; Choi et al, 2018; Williams et al, 2018; Shi et al, 2018).…”
Section: Introduction
confidence: 99%