2019
DOI: 10.1017/s1351324919000184
Jointly learning sentence embeddings and syntax with unsupervised Tree-LSTMs

Abstract: We introduce a neural network that represents sentences by composing their words according to induced binary parse trees. We use Tree-LSTM as our composition function, applied along a tree structure found by a fully differentiable natural language chart parser. Our model simultaneously optimises both the composition function and the parser, thus eliminating the need for externally-provided parse trees which are normally required for Tree-LSTM. It can therefore be seen as a tree-based RNN that is unsupervised w…
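To make the idea in the abstract concrete, the following is a minimal sketch, not the authors' code, of a fully differentiable chart parser: each span's representation is a learned composition of its sub-spans, and the candidate split points are mixed with a softmax so the whole chart stays differentiable. The dimensions, the linear composition standing in for the Tree-LSTM cell, and the split-point scorer are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DifferentiableChartParser(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.compose = nn.Linear(2 * dim, dim)   # toy stand-in for the Tree-LSTM cell
        self.score = nn.Linear(dim, 1)           # energy used to gate split points

    def forward(self, words: torch.Tensor) -> torch.Tensor:
        # words: (n, dim) word embeddings; returns a single sentence embedding.
        n, _ = words.shape
        chart = {(i, i): words[i] for i in range(n)}          # spans [i, j], inclusive
        for length in range(1, n):                            # span length minus one
            for i in range(n - length):
                j = i + length
                # candidate representations, one per split point k
                cands = torch.stack([
                    torch.tanh(self.compose(torch.cat([chart[(i, k)], chart[(k + 1, j)]])))
                    for k in range(i, j)
                ])                                            # (length, dim)
                # soft-gated mix over split points keeps everything differentiable
                weights = F.softmax(self.score(cands).squeeze(-1), dim=0)
                chart[(i, j)] = (weights.unsqueeze(-1) * cands).sum(0)
        return chart[(0, n - 1)]

# usage: embed a 5-word sentence of 16-dimensional word vectors
parser = DifferentiableChartParser(16)
sentence = torch.randn(5, 16)
print(parser(sentence).shape)  # torch.Size([16])
```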

Cited by 62 publications (98 citation statements)
References 24 publications
“…Recently, there has been growing interest in providing an inductive bias in neural networks by forcing layers to represent tree structures (Kim et al., 2017; Maillard et al., 2017; Choi et al., 2018; Niculae et al., 2018; Williams et al., 2018a; Liu and Lapata, 2018). Maillard et al. (2017) also operate on a chart but, rather than modeling discrete trees, use a soft-gating approach to mix representations of constituents in each given cell. While these models showed consistent improvements over comparable baselines, they do not seem to explicitly capture syntactic or semantic structures (Williams et al., 2018a).…”
Section: Related Work (mentioning, confidence: 99%)
“…Typically, recursive neural network models assume that an annotated treebank or a pretrained syntactic parser is available (Socher et al., 2013; Tai et al., 2015; Kim et al., 2019a), but recent work pays more attention to learning syntactic structures in an unsupervised manner. Yogatama et al. (2017) propose to use reinforcement learning, and Maillard et al. (2017) introduce the Tree-LSTM to jointly learn sentence embeddings and syntax trees, later combined with a Straight-Through Gumbel-Softmax estimator by Choi et al. (2018). In addition to sentence classification tasks, recent research has focused on unsupervised structure learning for language modeling (Shen et al., 2018, 2019; Drozdov et al., 2019; Kim et al., 2019b).…”
Section: Related Work (mentioning, confidence: 99%)
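The Straight-Through Gumbel-Softmax estimator mentioned above (Choi et al., 2018) can be sketched as follows: the forward pass takes a hard one-hot sample over candidate merges, while gradients flow through the soft relaxation. The temperature and the 4-way example below are assumptions for illustration, not values from the cited work.

```python
import torch
import torch.nn.functional as F

def st_gumbel_softmax(logits: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    # Sample Gumbel(0, 1) noise and form the relaxed (soft) sample.
    gumbel = -torch.log(-torch.log(torch.rand_like(logits) + 1e-20) + 1e-20)
    soft = F.softmax((logits + gumbel) / temperature, dim=-1)
    # Hard one-hot of the argmax in the forward pass, gradients flow through `soft`.
    index = soft.argmax(dim=-1, keepdim=True)
    hard = torch.zeros_like(soft).scatter_(-1, index, 1.0)
    return (hard - soft).detach() + soft

# usage: pick one of 4 candidate adjacent merges, differentiably
scores = torch.randn(4, requires_grad=True)
choice = st_gumbel_softmax(scores, temperature=0.5)
print(choice)            # one-hot selection in the forward pass
choice.sum().backward()  # gradients reach `scores` via the soft relaxation
```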
“…In early approaches, unsupervised parsers were trained by optimizing the marginal likelihood of sentences (Klein and Manning, 2014). More recent deep learning approaches (Yogatama et al., 2017; Maillard et al., 2017; Choi et al., 2018) obtain latent tree structures by reinforcement learning (RL). Typically, this involves a secondary task, e.g., a language modeling objective or a semantic task.…”
Section: Introduction (mentioning, confidence: 99%)
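A hedged sketch of the REINFORCE-style (score-function) update this passage refers to: a parser samples a discrete action, a secondary task supplies a scalar reward, and the surrogate loss is the negative log-probability of the sampled action weighted by that reward. The action space and the reward value below are placeholders, not details from any of the cited systems.

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, requires_grad=True)   # scores for 4 candidate parsing actions
probs = F.softmax(logits, dim=-1)
dist = torch.distributions.Categorical(probs)
action = dist.sample()                        # non-differentiable discrete choice

reward = torch.tensor(0.7)                    # placeholder: reward from the secondary task
loss = -dist.log_prob(action) * reward        # score-function (REINFORCE) surrogate loss
loss.backward()                               # gradient w.r.t. the parser's action scores
print(action.item(), logits.grad)
```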
“…For instance, Yogatama et al. [27] used REINFORCE algorithms [28] to train the shift-reduce parser without ground truth. Instead of the shift-reduce parsers, Maillard et al. [29] used a chart parser, which is fully differentiable by introducing a softmax annealing but suffers from O(n³) time and space complexity. Gumbel Tree-LSTM is a parsing strategy proposed by [14], which introduces Tree-LSTM and calculates the merging score for each adjacent node pair based on a learnable query vector and greedily merges the best pair with the highest score in the next layer.…”
Section: Learning Tree Structures for Language (mentioning, confidence: 99%)
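The greedy merge step described for Gumbel Tree-LSTM can be sketched as follows (assumed dimensions and a toy composition, not the original model): every adjacent pair of nodes gets a merge score from a learnable query vector, and the highest-scoring pair is merged at each layer until a single node remains.

```python
import torch
import torch.nn as nn

class GreedyMerger(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Parameter(torch.randn(dim))  # learnable query vector
        self.compose = nn.Linear(2 * dim, dim)       # toy stand-in for the Tree-LSTM cell

    def forward(self, nodes: torch.Tensor) -> torch.Tensor:
        # nodes: (n, dim) leaf representations; returns one root representation.
        while nodes.size(0) > 1:
            # candidate parents for every adjacent pair, and their merge scores
            parents = torch.tanh(self.compose(torch.cat([nodes[:-1], nodes[1:]], dim=-1)))
            scores = parents @ self.query
            best = scores.argmax().item()            # greedily merge the best-scoring pair
            nodes = torch.cat([nodes[:best], parents[best:best + 1], nodes[best + 2:]], dim=0)
        return nodes[0]

# usage: reduce a 6-node sequence of 8-dimensional vectors to a single root vector
merger = GreedyMerger(8)
print(merger(torch.randn(6, 8)).shape)  # torch.Size([8])
```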