Proceedings of the 28th International Conference on Computational Linguistics 2020
DOI: 10.18653/v1/2020.coling-main.356
How LSTM Encodes Syntax: Exploring Context Vectors and Semi-Quantization on Natural Text

Abstract: Long Short-Term Memory recurrent neural networks (LSTMs) are widely used and known to capture informative long-term syntactic dependencies. However, how such information is reflected in their internal vectors for natural text has not yet been sufficiently investigated. We analyze them by learning a language model where syntactic structures are implicitly given. We empirically show that the context update vectors, i.e. outputs of internal gates, are approximately quantized to binary or ternary values to help the la…
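The abstract's central claim — that context update vectors are approximately quantized to binary or ternary values — can be probed with a small numpy sketch. This is a hedged illustration, not the paper's code: `ternary_deviation` is a hypothetical helper that measures how close gate-like activations lie to the ternary set {-1, 0, 1}, and the tanh-squashed random vectors stand in for real LSTM gate outputs.

```python
import numpy as np

# Hypothetical helper (not from the paper): mean absolute distance from
# each activation to its nearest level in the ternary set {-1, 0, 1},
# a rough proxy for the "semi-quantization" described in the abstract.
def ternary_deviation(values):
    levels = np.array([-1.0, 0.0, 1.0])
    dists = np.abs(values[:, None] - levels[None, :])
    return dists.min(axis=1).mean()

rng = np.random.default_rng(0)
saturated = np.tanh(10.0 * rng.normal(size=1000))  # sharply saturated gates
diffuse = np.tanh(0.5 * rng.normal(size=1000))     # mildly squashed gates

# Saturated activations cluster near the ternary levels; diffuse ones do not.
assert ternary_deviation(saturated) < ternary_deviation(diffuse)
```

A low deviation on real gate outputs would indicate the near-discrete behavior the paper reports; the thresholds and sampling here are purely illustrative.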

Cited by 6 publications (4 citation statements)
References 23 publications
“…Experiments suggest that LSTMs trained on synthetic tasks learn to implement counter memory (Weiss et al., 2018; Suzgun et al., 2019a), and that they fail on tasks requiring stacks and other deeper models of structure (Suzgun et al., 2019b). Similarly, Shibata et al. (2020) found that LSTM language models trained on natural language data acquire saturated representations approximating counters.…”
Section: NLP and Formal Language Theory
confidence: 90%
“…Phonology To study whether our models learn phonologically meaningful representations, we study our high-dimensional hidden representation for each item of our vocabulary, as suggested in Madsen et al. (2021). We reduce the dimensionality of our encoded representations using PCA (Pearson, 1901) and t-SNE (van der Maaten and Hinton, 2008) and look at the emerging underlying organisation of the phonetic space, as was done in Jacobs and Mailhot (2019) and Shibata et al. (2020) for, respectively, seq2seq phonetic and LSTM syntactic representation analysis.…”
Section: Synchronic Probes
confidence: 99%
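The PCA-then-t-SNE probing recipe described in the excerpt above can be sketched with scikit-learn. This is a minimal sketch under stated assumptions: the random matrix stands in for real LSTM hidden states, and the dimensions and perplexity are illustrative choices, not values from the cited work.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Stand-in for real hidden representations: 200 vocabulary items,
# each represented by a 512-dimensional hidden state.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(200, 512))

# Step 1: denoise and compress with PCA before the nonlinear embedding.
reduced = PCA(n_components=50).fit_transform(hidden)

# Step 2: project to 2-D with t-SNE to inspect the emerging organisation.
embedded = TSNE(n_components=2, perplexity=30.0,
                init="pca", random_state=0).fit_transform(reduced)

print(embedded.shape)  # (200, 2)
```

Running t-SNE on PCA-reduced vectors rather than the raw 512-dimensional states is the usual practice: it suppresses noise directions and makes the pairwise-distance computation considerably cheaper.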
“…Similarly, they cannot reliably reverse strings (Hao et al., 2018; Merrill, 2019). Shibata et al. (2020) show that LSTM language models trained on natural language acquire semi-saturated representations where the gates tightly cluster around discrete values. Thus, sLSTMs appear to be a promising formal model of the counting behavior of LSTMs on both synthetic and natural tasks.…”
Section: Saturated Networks as Automata
confidence: 99%