Felix A. Gers scite author profile

Long short-term memory (LSTM; Hochreiter & Schmidhuber, 1997) can solve numerous tasks not solvable by previous learning algorithms for recurrent neural networks (RNNs). We identify a weakness of LSTM networks processing continual input streams that are not a priori segmented into subsequences with explicitly marked ends at which the network's internal state could be reset. Without resets, the state may grow indefinitely and eventually cause the network to break down. Our remedy is a novel, adaptive "forget gate" that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources. We review illustrative benchmark problems on which standard LSTM outperforms other RNN algorithms. All algorithms (including LSTM) fail to solve continual versions of these problems. LSTM with forget gates, however, easily solves them, and in an elegant way.
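To illustrate the failure mode and the remedy described in this abstract, here is a minimal scalar sketch of one memory cell's state update. It is not the authors' implementation: the function names, the scalar weights, and the constant input are hypothetical. The gates here depend only on the current input, and the output gate and recurrent connections are left out. With the forget gate disabled (fixed at 1.0), the state accumulates without bound on a continual input stream. With it enabled, the state settles to a bounded value.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def cell_step(c_prev: float, x: float, w_in: float, w_forget: float,
              w_cand: float, use_forget_gate: bool = True) -> float:
    """One update of a single cell's internal state (simplified scalar sketch)."""
    i = sigmoid(w_in * x)        # input gate activation
    g = math.tanh(w_cand * x)    # squashed candidate input
    # Without a forget gate the previous state is carried over with weight 1.0.
    f = sigmoid(w_forget * x) if use_forget_gate else 1.0
    return f * c_prev + i * g


def run(steps: int, use_forget_gate: bool) -> float:
    """Feed a constant input of 1.0 for `steps` time steps (hypothetical setup)."""
    c = 0.0
    for _ in range(steps):
        c = cell_step(c, x=1.0, w_in=1.0, w_forget=-2.0, w_cand=1.0,
                      use_forget_gate=use_forget_gate)
    return c


if __name__ == "__main__":
    print("state without forget gate:", run(1000, use_forget_gate=False))
    print("state with forget gate:   ", run(1000, use_forget_gate=True))
```

In the sketch the forget weight is fixed by hand. In the approach the abstract describes, the cell learns when to reset itself.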

LSTM recurrent networks learn simple context-free and context-sensitive languages

Gers, Schmidhuber · 2001 · IEEE Trans. Neural Netw.

Previous work on learning regular languages from exemplary training sequences showed that long short-term memory (LSTM) outperforms traditional recurrent neural networks (RNNs). We demonstrate LSTM's superior performance on context-free language benchmarks for RNNs, and show that it works even better than previous hardwired or highly specialized architectures. To the best of our knowledge, LSTM variants are also the first RNNs to learn a simple context-sensitive language, namely a^n b^n c^n.
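For readers unfamiliar with the language a^n b^n c^n, the following sketch generates its strings and checks membership. It only illustrates the language and is not the training or evaluation setup from the paper. The function names are hypothetical, and requiring n >= 1 is an assumption made here.

```python
def make_string(n: int) -> str:
    """Return the string a^n b^n c^n, e.g. n=2 gives 'aabbcc'."""
    return "a" * n + "b" * n + "c" * n


def is_anbncn(s: str) -> bool:
    """Check whether s consists of n a's, then n b's, then n c's (n >= 1 assumed)."""
    n, remainder = divmod(len(s), 3)
    return remainder == 0 and n >= 1 and s == make_string(n)


if __name__ == "__main__":
    for candidate in ["abc", "aabbcc", "aabbc", "abcabc"]:
        print(candidate, is_anbncn(candidate))
```

Recognizing these strings requires keeping the counts of all three symbols equal, which is what makes the language context-sensitive rather than context-free.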

Learning to forget: continual prediction with LSTM

1999

Recurrent nets that time and count

2000

Applying LSTM to Time Series Predictable through Time-Window Approaches

2001
