Other methods for analyzing NLP models include (i) inspecting the mechanisms a model uses to encode information, e.g., attention weights (Voita et al., 2018; Raganato and Tiedemann, 2018; Voita et al., 2019b; Clark et al., 2019; Kovaleva et al., 2019) or individual neurons (Karpathy et al., 2015; Pham et al., 2016; Bau et al., 2019), and (ii) examining model predictions using manually defined templates, either to evaluate sensitivity to specific grammatical errors (Linzen et al., 2016; Gulordava et al., 2018; Tran et al., 2018; Marvin and Linzen, 2018) or to probe what language models know when applying them as knowledge bases or in QA settings (Radford et al., 2019; Petroni et al., 2019; Poerner et al., 2019; Jiang et al., 2019). An information-theoretic view on the analysis of NLP models was previously taken by Voita et al. (2019a), who explain how representations in the Transformer evolve between layers under different training objectives.