2019
DOI: 10.48550/arxiv.1904.02142
Preprint

Unsupervised Latent Tree Induction with Deep Inside-Outside Recursive Autoencoders

Cited by 4 publications (6 citation statements). References 0 publications.

“…It is noteworthy that there are many other important miscellaneous works we do not mention in the previous sections. For example, numerous works have proposed to improve upon vanilla gradient-based methods [174,178,65]; linguistic rules such as negation and morphological inflection can be extracted by neural models [141,142,158]; probing tasks can be used to explore linguistic properties of sentences [3,80,43,75,89,74,34]; the hidden state dynamics in recurrent nets are analysed to illuminate the learned long-range dependencies [73,96,67,179,94]; [169,166,168,101,57,167] studied the ability of neural sequence models to induce lexical, grammatical and syntactic structures; [91,90,12,136,159,24,151,85] modeled the reasoning process of the model to explain model behaviors; [157,139,28,163,219,170,180,137,106,58,162,81...…”
Section: Miscellaneous (mentioning)
confidence: 99%
“…Shen et al. (2019) designed a novel recurrent architecture to automatically capture the latent tree structure of an input sentence. Other studies learned syntactic parsers (Drozdov et al. 2019; Htut, Cho, and Bowman 2019; Kitaev, Cao, and Klein 2019; Li, Mou, and Keller 2019; Li and Eisner 2019; Mrini et al. 2019), although these methods pursued high parsing accuracy rather than explaining the DNN. Essentially, the learning of the syntactic parser aimed to make the parser fit syntactic structures defined by human experts.…”
Section: Related Work (mentioning)
confidence: 99%
“…In recent years, explaining the features encoded inside a DNN has become an emerging direction. Based on the inherent hierarchical structure of natural language, many methods use latent tree structures of language to guide the DNN to learn interpretable feature representations (Choi, Yoo, and Lee 2018; Drozdov et al. 2019; Shen et al. 2018, 2019; Shi et al. 2018; Tai, Socher, and Manning 2015; Wang, Lee, and Chen 2019; Yogatama et al. 2016). However, interpretability usually conflicts with discrimination power (Bau et al. 2017).…”
Section: Introduction (mentioning)
confidence: 99%
“…However, these methods cannot learn simple grammar and meaningful semantics, though they perform well on NLI tasks [12]. Additionally, several approaches [13]-[15] aim to learn unsupervised parse trees; however, they perform poorly on end tasks. In this paper, we demonstrate that our approach can capture both grammar and semantics in the sentence, thus learning better parse trees and outperforming RvNN-based models on several tasks.…”
Section: Related Work (mentioning)
confidence: 99%
“…Moreover, as we will show in our experimental section, the grammar or meta-level association detected by their approach is relatively trivial. Recently, [13] proposed an unsupervised latent chart-based tree parsing algorithm, viz. DIORA, which uses the inside-outside algorithm for parsing and an autoencoder-based neural network trained to reconstruct the input sentence. DIORA is trained end to end with a masked-language-model objective, predicting each withheld word.…”
Section: Related Work (mentioning)
confidence: 99%
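
To make the chart mechanism in that excerpt concrete, below is a minimal sketch of the CKY-style inside pass that DIORA-like chart parsers build on: each span's vector is a score-weighted soft average over all binary splits of the span. This is an illustrative toy under assumed placeholders (the `compose` MLP and the `v.sum()` split-score term are invented for the example), not the authors' implementation, which additionally runs an outside pass and optimizes the masked-word reconstruction loss.

```python
# Minimal sketch of a DIORA-style inside pass over a CKY chart.
# Placeholder composition/scoring functions; not the original code.
import numpy as np

def compose(left, right, W):
    # Compose two child span vectors into a parent vector (toy MLP).
    return np.tanh(W @ np.concatenate([left, right]))

def inside_pass(leaf_vecs, W):
    """For each span [i, j), softly average over all binary split
    points, weighting each split by a compatibility score."""
    n, d = leaf_vecs.shape
    chart = {}   # (i, j) -> span vector
    score = {}   # (i, j) -> scalar inside score
    for i in range(n):
        chart[(i, i + 1)] = leaf_vecs[i]
        score[(i, i + 1)] = 0.0
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            cands, logits = [], []
            for k in range(i + 1, j):  # all binary splits of [i, j)
                v = compose(chart[(i, k)], chart[(k, j)], W)
                # Split score: children's scores plus a toy local term.
                s = score[(i, k)] + score[(k, j)] + float(v.sum())
                cands.append(v)
                logits.append(s)
            p = np.exp(logits - np.max(logits))  # softmax over splits
            p /= p.sum()
            chart[(i, j)] = sum(w * v for w, v in zip(p, cands))
            score[(i, j)] = float(np.dot(p, logits))
    return chart, score

# Usage: 4 "words" with 8-dimensional embeddings.
rng = np.random.default_rng(0)
leaves = rng.normal(size=(4, 8))
W = rng.normal(size=(8, 16)) * 0.1
chart, _ = inside_pass(leaves, W)
print(chart[(0, 4)].shape)  # vector for the full-sentence span
```

The soft weighting over split points is what keeps the chart differentiable for end-to-end training; at test time, taking the argmax split at each span instead of the soft average recovers a discrete parse tree.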