2019
DOI: 10.1609/aaai.v33i01.33016309

What Is One Grain of Sand in the Desert? Analyzing Individual Neurons in Deep NLP Models

Abstract: Despite the remarkable evolution of deep neural networks in natural language processing (NLP), their interpretability remains a challenge. Previous work largely focused on what these models learn at the representation level. We break this analysis down further and study individual dimensions (neurons) in the vector representation learned by end-to-end neural models in NLP tasks. We propose two methods: Linguistic Correlation Analysis, based on a supervised method to extract the most relevant neurons with respe…

Cited by 91 publications (97 citation statements) | References 13 publications
“…They found the language model representations to consistently outperform those from NMT. In other work, we have found that language model representations are of similar quality to NMT ones in terms of POS and morphology, but are behind in terms of semantic tagging (Dalvi et al 2019a). Tenney et al (2019) compared representations from CoVE, ELMo, GPT, and BERT on a number of classification tasks, partially overlapping with the ones we study.…”
Section: Contextualized Word Representations
confidence: 83%
“…We analyzed individual tags across layers and found that open class categories such as verbs and nouns are distributed across several layers, although the majority of the learning of these phenomena still happens at layer 1. Please refer to Dalvi et al (2019a) for further information. Figure 6: Morphological tagging accuracy using representations from layers 1 to 4, taken from encoders and decoders of different language pairs.…”
Section: Effect of Network Depth
confidence: 99%
“…For example, if the input was '000 t1 t2', we trained the classifier to predict 't1' for the encoder activations of the second time step and to predict 't2' for the activations of the third time step. Following the methodology of Dalvi et al (2019), we subsequently added units to a set, depending on the absolute weight they were assigned in the diagnostic classifier. After each addition, we re-calculated the prediction accuracy.…”
Section: Functional Groups
confidence: 99%
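The incremental procedure this excerpt describes is easy to make concrete. Below is a minimal sketch, assuming `X_train`/`X_test` are (samples × neurons) activation matrices and `y_train`/`y_test` the target labels; the function name and the choice to retrain on each subset are illustrative assumptions, not the cited authors' code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def accuracy_by_neuron_subset(X_train, y_train, X_test, y_test):
    """Grow a neuron set in order of absolute diagnostic-classifier
    weight, re-evaluating accuracy after each addition."""
    # Diagnostic classifier trained on all neuron activations.
    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    # Per-neuron relevance: largest absolute weight across classes.
    order = np.argsort(-np.abs(clf.coef_).max(axis=0))
    accuracies = []
    for k in range(1, len(order) + 1):
        subset = order[:k]
        # Retrain on the current subset and score it (an assumption:
        # the cited work may instead mask the unselected units).
        sub = LogisticRegression(max_iter=1000).fit(X_train[:, subset], y_train)
        accuracies.append(sub.score(X_test[:, subset], y_test))
    return order, accuracies
```

Plotting `accuracies` against subset size then shows how quickly a small group of high-weight units recovers full performance.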
“…However, unlike Dalvi et al (2019), we do not use any regularization on the DC, in order to contrast the different degrees to which information is distributed across neurons in the two model types. Applying the methods developed by Lundberg and Lee (2017) seems to confirm the responsible neurons we found, but selects more neurons and gives less consistent results, which we trace back to the extensive approximations required and some model assumptions (e.g. feature independence) being violated.…”
confidence: 92%
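One way to read this excerpt: an unregularized diagnostic classifier (DC) is free to spread its weight according to how distributed the encoded information actually is. The sketch below illustrates that idea with a hypothetical `weight_concentration` helper and a simple top-k mass statistic of my own choosing; it is not the citing paper's code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def weight_concentration(X, y, top_k=10):
    """Train an unregularized diagnostic classifier on activations
    X (samples x neurons) and report the fraction of absolute weight
    mass carried by the top_k neurons."""
    # No penalty, so the raw weight distribution is left free to
    # reflect how widely the property is spread across neurons
    # (scikit-learn >= 1.2; older versions spell this penalty='none').
    clf = LogisticRegression(penalty=None, max_iter=2000).fit(X, y)
    weights = np.abs(clf.coef_).max(axis=0)  # per-neuron relevance
    ordered = np.sort(weights)[::-1]
    return ordered[:top_k].sum() / ordered.sum()
```

Comparing this ratio across the two model types gives a rough measure of how localized the information is in each.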
“…We provide three methods to analyze neural network models in the toolkit: Individual and Cross-model Analysis, which search for neurons important to the model itself (Bau et al 2019), and Linguistic Correlation Analysis, which identifies important neurons w.r.t. an extrinsic property (Dalvi et al 2019). The output of each method is a ranked list of neurons.…”
Section: Analysis Methods
confidence: 99%
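As a rough illustration of how such a ranked list can be produced in the spirit of Linguistic Correlation Analysis: train a regularized probing classifier on the activations and sort neurons by the weight they receive. Everything below (the `rank_neurons` name, the elastic-net settings, the max-over-classes scoring) is an assumption for illustration, not the toolkit's actual API.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def rank_neurons(X, y, l1_ratio=0.5, C=1.0):
    """Return neuron indices sorted by relevance to the property y,
    given an activation matrix X of shape (samples, neurons)."""
    # Elastic-net regularization pushes weight onto a small set of
    # property-relevant neurons instead of spreading it thinly.
    clf = LogisticRegression(penalty="elasticnet", solver="saga",
                             l1_ratio=l1_ratio, C=C,
                             max_iter=2000).fit(X, y)
    # Score each neuron by its largest absolute weight across classes.
    scores = np.abs(clf.coef_).max(axis=0)
    return np.argsort(-scores)
```

For example, `rank_neurons(activations, pos_tags)[:20]` (both arguments hypothetical) would return the twenty neurons the classifier leans on most when predicting part-of-speech tags.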