Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) 2018
DOI: 10.18653/v1/p18-2117

On the Practical Computational Power of Finite Precision RNNs for Language Recognition

Abstract: While Recurrent Neural Networks (RNNs) are famously known to be Turing complete, this relies on infinite precision in the states and unbounded computation time. We consider the case of RNNs with finite precision whose computation time is linear in the input length. Under these limitations, we show that different RNN variants have different computational power. In particular, we show that the LSTM and the Elman-RNN with ReLU activation are strictly stronger than the RNN with a squashing activation and the GRU.
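The separation claimed in the abstract comes down to whether the state update can accumulate values without bound. Below is a minimal NumPy sketch (hand-set 1x1 weights chosen purely for illustration, not taken from the paper) contrasting a ReLU Elman update with a squashing (tanh) update:

```python
import numpy as np

def relu_rnn_step(h, x, W_h, W_x):
    # Elman RNN with ReLU: the state is unbounded above, so a unit
    # can accumulate +1 per input symbol and act as a counter.
    return np.maximum(0.0, W_h @ h + W_x @ x)

def tanh_rnn_step(h, x, W_h, W_x):
    # Squashing activation: the state is confined to (-1, 1), so under
    # finite precision only finitely many state values are reachable.
    return np.tanh(W_h @ h + W_x @ x)

# Hand-set weights (illustrative assumptions, not learned values).
W_h = np.array([[1.0]])
W_x = np.array([[1.0]])
h_relu = np.array([0.0])
h_tanh = np.array([0.0])
for _ in range(5):              # feed five identical symbols x = 1
    x = np.array([1.0])
    h_relu = relu_rnn_step(h_relu, x, W_h, W_x)
    h_tanh = tanh_rnn_step(h_tanh, x, W_h, W_x)
print(h_relu)   # [5.]    -- counts the symbols
print(h_tanh)   # ~[0.96] -- saturates and loses the count
```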

Cited by 165 publications (187 citation statements)
References 15 publications
“…[14] LSTMs are more powerful than GRU networks, as they are able to learn a counting mechanism. [89] Combined with a simple hill-climb algorithm for optimization (an off-policy policy gradient algorithm with binary rewards, which can also be interpreted as iterative fine-tuning), the LSTM has recently been shown to perform as well as more sophisticated reinforcement learning algorithms such as proximal policy optimization (PPO) or advantage actor-critic (A2C). [49] The model used was an LSTM with 3 layers and a hidden size of 1024.…”
Section: SMILES LSTM
confidence: 99%
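For concreteness, here is a minimal PyTorch sketch of the architecture the excerpt mentions (3 LSTM layers, hidden size 1024). The vocabulary and embedding sizes are placeholder assumptions, as the excerpt does not specify them; this is not the cited authors' code:

```python
import torch
import torch.nn as nn

class SmilesLSTM(nn.Module):
    """Sketch of the model described in the excerpt: a 3-layer LSTM
    with hidden size 1024. Vocabulary and embedding sizes are assumed."""
    def __init__(self, vocab_size=64, embed_dim=256,
                 hidden_size=1024, num_layers=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_size,
                            num_layers=num_layers, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, tokens, state=None):
        # tokens: (batch, seq_len) integer-encoded SMILES characters
        h, state = self.lstm(self.embed(tokens), state)
        return self.out(h), state  # next-character logits per position
```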
“…Also related is the work of Weiss et al (2018), who demonstrate that LSTMs are able to count infinitely, since their cell states are unbounded, while GRUs cannot count infinitely since the activations are constrained to a finite range. One avenue of future work could compare the performance of LSTMs and GRUs on the memorization task.…”
Section: Related Work
confidence: 99%
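The contrast the excerpt draws is visible directly in the two update equations: the LSTM cell update is additive, while the GRU update is a convex interpolation between values in (-1, 1). A small numeric sketch with hand-picked saturated gate values (illustrative assumptions, not learned parameters):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Saturated gate values chosen by hand for illustration.
f = i = z = sigmoid(10.0)   # forget/input/update gates ~= 1
g = h_new = np.tanh(10.0)   # candidate values ~= 1

# LSTM: c_t = f * c_{t-1} + i * g is additive, so the cell state
# grows roughly +1 per step -- an unbounded counter.
c = 0.0
for _ in range(1000):
    c = f * c + i * g
print(c)    # ~978: keeps growing as steps increase

# GRU: h_t = (1 - z) * h_{t-1} + z * h_new is a convex combination
# of values in (-1, 1), so |h_t| < 1 at every step -- it cannot count.
h = 0.0
for _ in range(1000):
    h = (1.0 - z) * h + z * h_new
print(h)    # ~1.0, but strictly inside (-1, 1)
```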
“…However, there still remain some fundamental questions regarding the practical computational expressivity of RNNs with finite precision. Weiss et al. (2018) have recently demonstrated that Long Short-Term Memory (LSTM) models (Hochreiter and Schmidhuber, 1997), a popular variant of RNNs, can theoretically emulate a simple real-time k-counter machine, which can be described as a finite-state controller with k separate counters, each holding an integer value and capable of adding ±1 or 0 to its content at each time step (Fischer et al., 1968). The authors further tested their theoretical result by training LSTM networks to learn aⁿbⁿ and aⁿbⁿcⁿ.…”
Section: Introduction
confidence: 99%
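A real-time counter machine of this kind is simple to state directly. Below is a hedged Python sketch of a 1-counter recognizer for aⁿbⁿ (aⁿbⁿcⁿ is analogous with k = 2 counters); it illustrates the model the excerpt describes and is not code from the cited papers:

```python
def recognize_anbn(s: str) -> bool:
    """A real-time 1-counter machine for a^n b^n (n >= 1): a finite
    controller plus one integer counter that is incremented,
    decremented, or left unchanged at each input symbol."""
    counter = 0
    state = "A"                 # A: reading a's, B: reading b's
    for ch in s:
        if state == "A" and ch == "a":
            counter += 1        # +1 for each 'a'
        elif ch == "b" and counter > 0:
            state = "B"
            counter -= 1        # -1 for each 'b'
        else:
            return False        # wrong symbol order or too many b's
    return state == "B" and counter == 0

assert recognize_anbn("aaabbb") and not recognize_anbn("aabbb")
```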