2001
DOI: 10.1109/72.963769

LSTM recurrent networks learn simple context-free and context-sensitive languages

Abstract: Previous work on learning regular languages from exemplary training sequences showed that long short-term memory (LSTM) outperforms traditional recurrent neural networks (RNNs). We demonstrate LSTM's superior performance on context-free language benchmarks for RNNs, and show that it works even better than previous hardwired or highly specialized architectures. To the best of our knowledge, LSTM variants are also the first RNNs to learn a simple context-sensitive language, namely a^n b^n c^n.
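To make the benchmark concrete, the following minimal Python sketch (not the authors' exact experimental setup) shows how strings of the form a^n b^n c^n can be generated and framed as a one-step look-ahead prediction task, the usual formulation in this line of work: the network reads one symbol at a time and must predict the next one. The alphabet ordering, one-hot encoding, and range of n are illustrative assumptions.

# Illustrative sketch only: data generation for next-symbol prediction on a^n b^n c^n.
import random

SYMBOLS = ["a", "b", "c"]                      # assumed alphabet
INDEX = {s: i for i, s in enumerate(SYMBOLS)}  # one-hot positions


def make_string(n):
    """Return the string a^n b^n c^n as a list of symbols."""
    return ["a"] * n + ["b"] * n + ["c"] * n


def one_hot(symbol):
    """One-hot encode a single symbol."""
    vec = [0.0] * len(SYMBOLS)
    vec[INDEX[symbol]] = 1.0
    return vec


def prediction_pairs(string):
    """Yield (input, target) pairs for one-step look-ahead prediction:
    at each position the target is simply the next symbol."""
    for current, nxt in zip(string[:-1], string[1:]):
        yield one_hot(current), one_hot(nxt)


if __name__ == "__main__":
    random.seed(0)
    # Train on small n; generalization would then be tested on larger, unseen n.
    for n in random.sample(range(1, 11), 3):
        s = make_string(n)
        print(f"n={n}:", "".join(s))
        for x, y in prediction_pairs(s):
            pass  # feed (x, y) to an RNN/LSTM trainer of your choice here

The interesting part of the task is that n is not announced to the network, so it must count the a's internally to predict where the b's and c's end.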


Cited by 577 publications (294 citation statements)
References 19 publications

Citation statements:
“…It is possible to evolve good problem-specific topologies (Bayer et al., 2009). Some LSTM variants also use modifiable self-connections of CECs (Gers and Schmidhuber, 2001).…”
Section: Supervised Recurrent Very Deep Learner (LSTM RNN) (mentioning)
confidence: 99%
“…Together with recent parallel or subsequent work (e.g. Bodén and Wiles, 2000; Gers and Schmidhuber, 2001; Rodriguez, 2001) on learning of MA and similar tasks our study might therefore contribute to a better understanding of the capabilities of recurrent neural networks to process natural language. While our first experimental results were announced in (Chalup and Blair, 1999) we now provide a detailed exposition of our investigation.…”
(mentioning)
confidence: 99%
“…With the aim of investigating learnability of the one-step look-ahead prediction task for non-regular languages we performed experiments using training sequences formed by strings of one of the following types, where always n ≥ 1: … Since our interest is focused on the language of multiple agreements MA = {s_n; n ≥ 1} we will from now on use strings s_n = a… Since the depth n is not known to the network at the start of a string it cannot predict when the first b will occur and logically it cannot know how many a's… Some authors have applied a more rigid interpretation, insisting that the output for the predicted symbol must be above a fixed threshold and that the outputs for all other symbols must be below that threshold (e.g. Rodriguez et al., 1999; Bodén and Wiles, 2000; Gers and Schmidhuber, 2001). …”
Section: Prediction Task (mentioning)
confidence: 99%
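The stricter acceptance criterion described in the excerpt above can be stated as a small check. The sketch below is one illustrative reading of that criterion, not code from any of the cited papers; the 0.5 threshold, the dictionary interface, and the use of a set of legal next symbols (needed when more than one continuation is grammatical) are assumptions.

# Illustrative sketch of the threshold-based acceptance criterion mentioned
# in the quote above: every legal next symbol must be above a fixed threshold
# and every other symbol below it. Threshold value and interface are assumed.

def strict_prediction_ok(outputs, legal_next, threshold=0.5):
    """outputs: dict mapping symbol -> activation of its output unit.
    legal_next: set of symbols that may legally occur next.
    Returns True iff all legal symbols are above and all others below threshold."""
    for symbol, activation in outputs.items():
        if symbol in legal_next:
            if activation <= threshold:
                return False
        elif activation >= threshold:
            return False
    return True


if __name__ == "__main__":
    # After reading "aab" of a^2 b^2 c^2, only "b" is a legal next symbol.
    print(strict_prediction_ok({"a": 0.1, "b": 0.9, "c": 0.2}, {"b"}))  # True
    print(strict_prediction_ok({"a": 0.6, "b": 0.9, "c": 0.2}, {"b"}))  # False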