Long short-term memory (LSTM; Hochreiter & Schmidhuber, 1997) can solve numerous tasks not solvable by previous learning algorithms for recurrent neural networks (RNNs). We identify a weakness of LSTM networks processing continual input streams that are not a priori segmented into subsequences with explicitly marked ends at which the network's internal state could be reset. Without resets, the state may grow indefinitely and eventually cause the network to break down. Our remedy is a novel, adaptive "forget gate" that enables an LSTM cell to learn to reset itself at appropriate times, thus releasing internal resources. We review illustrative benchmark problems on which standard LSTM outperforms other RNN algorithms. All algorithms (including LSTM) fail to solve continual versions of these problems. LSTM with forget gates, however, easily solves them, and in an elegant way.
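The unbounded state growth described above can be illustrated with a minimal scalar sketch. It assumes the commonly cited cell update s_t = f_t * s_(t-1) + i_t * g(x_t), where a cell without a forget gate corresponds to f_t = 1. The function name `run_cell`, the unit weights, and the fixed forget-gate bias are hypothetical choices for illustration. In the actual method the forget gate is adaptive and learned, not fixed.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def run_cell(inputs, forget_bias=None):
    """Run one scalar memory cell over a stream and return its state trajectory.

    forget_bias=None  -> no forget gate, i.e. f_t = 1 (state can only accumulate)
    forget_bias=float -> fixed gate f_t = sigmoid(forget_bias); an assumption made
                         for illustration, since a real forget gate is learned
    """
    state = 0.0
    trajectory = []
    for x in inputs:
        i_gate = sigmoid(x)      # input gate with a hypothetical unit weight
        g = math.tanh(x)         # squashed cell input
        f_gate = 1.0 if forget_bias is None else sigmoid(forget_bias)
        state = f_gate * state + i_gate * g
        trajectory.append(state)
    return trajectory


# A continual, unsegmented stream: the same input repeated with no reset marker.
stream = [1.0] * 1000
no_forget = run_cell(stream)
with_forget = run_cell(stream, forget_bias=0.0)

print(f"final state without forget gate: {no_forget[-1]:.1f}")
print(f"final state with forget gate:    {with_forget[-1]:.3f}")
```

Without the gate the state grows linearly with the length of the stream, whereas any gate value below 1 keeps it bounded. A learned gate can additionally drive f_t toward 0 at the end of a subsequence, which amounts to a self-triggered reset.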
The evidence for isochrony of stress timing is weak at best for ordinary prose, but this does not mean that the timing of stresses is always unaffected by global constraints. We asked subjects to continually repeat the phrase "Take a pack of cards" and to temporally align the words "take" and "cards" with an auditorily presented stimulus consisting of just the words "take" and "cards" repeated several times. The phase of the "cards" stimulus relative to a reference cycle defined by the take-take interval was varied over the range 0.3–0.65 in eight equal-sized phase steps. The distribution of actually produced phases for the vowel onset of the syllable "cards", however, was strongly trimodal. Subjects showed a powerful preference for phases close to 0.5, and somewhat weaker preferences for phases near 0.36 and 0.6. These values are close to (although systematically different from) 0.33 and 0.66 predicted by a simple harmonic model for stress timing. The observed distribution had this form whether the subjects were speaking along with the stimulus, or trying to maintain the prescribed timing after cessation of the stimulus. Furthermore, the observed phase was influenced by the phase produced on previous trials, suggesting dynamic control with hysteresis between competing stable patterns of timing. These results demonstrate strong rhythmic constraints on the timing of stresses within a phrase, where the domain of 'phrase' in this artificial speaking task is simply the repeated text. The rhythmic constraints are similar to those observed for limb movements. Modeling these constraints should provide insight into the form of a general dynamic control regime for global speech timing, and may allow improved characterization of 'natural' timing patterns in English speech.
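The phase measure used above can be sketched as a simple ratio: the delay from one "take" onset to the "cards" onset, divided by the take-take interval. The function names and the onset times in the usage lines are hypothetical, and the candidate values 1/3, 1/2, and 2/3 stand in for the simple harmonic predictions mentioned in the abstract. This is an illustrative sketch, not the authors' analysis procedure.

```python
def relative_phase(take_onset: float, cards_onset: float, next_take_onset: float) -> float:
    """Phase of the 'cards' onset within the cycle defined by two 'take' onsets.

    All arguments are times in seconds; the result lies in [0, 1) when
    take_onset <= cards_onset < next_take_onset.
    """
    cycle = next_take_onset - take_onset
    if cycle <= 0:
        raise ValueError("next_take_onset must be later than take_onset")
    return (cards_onset - take_onset) / cycle


def nearest_harmonic_phase(phase: float, candidates=(1 / 3, 1 / 2, 2 / 3)) -> float:
    """Return the candidate phase closest to the observed one."""
    return min(candidates, key=lambda c: abs(phase - c))


# Hypothetical onsets: a 2.0 s take-take cycle with 'cards' starting at 0.72 s.
phase = relative_phase(0.0, 0.72, 2.0)
print(f"observed phase: {phase:.2f}")
print(f"nearest simple-harmonic phase: {nearest_harmonic_phase(phase):.3f}")
```

Binning produced phases this way is one way to look for the trimodal pattern the abstract reports, although the reported modes near 0.36 and 0.6 differ systematically from the harmonic values.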